@fry69
Created September 18, 2024 06:04
o1-mini rewriting gcm.sh

The provided gcm shell function uses a large language model (LLM) to generate concise Git commit messages from staged changes, which saves the effort of writing each message by hand. Below is an analysis of the existing code, followed by recommended optimizations and improvements to both the code and the user experience.


1. Code Analysis and Current Functionality

Overview

The gcm function performs the following steps:

  1. Diff Extraction: Retrieves the current staged changes (git diff --cached) while excluding specific lock files (pnpm-lock.yaml, package-lock.json, yarn.lock).
  2. AI Integration: Sends the extracted diff to an LLM (specifically claude-3-haiku) via the llm CLI tool to generate a concise, one-line commit message with appropriate prefixes (feat:, fix:, doc:, chore:).
  3. User Interaction: Presents the generated commit message to the user, offering options to accept, edit, regenerate, or cancel the commit.
  4. Commit Execution: Executes the commit based on user input, providing feedback on the success or failure of the operation.

Strengths

  • Automation: Streamlines the commit message creation process using AI, saving time and ensuring consistency.
  • User Control: Offers flexibility by allowing users to accept, edit, regenerate, or cancel the generated commit message.
  • Exclusion Handling: Prevents unnecessary files (lock files) from influencing commit messages, maintaining relevance.

2. Code Optimizations and Improvements

a. Dependency Checks

Issue: The function assumes that both git and the llm CLI tool are installed and available in the user's environment. If either is missing, the function will fail without a clear indication of the missing dependency.

Recommendation: Add checks to verify the presence of required commands before proceeding. This enhances robustness and provides clear error messages to users.

Implementation:

# At the beginning of the gcm function
if ! command -v git &> /dev/null; then
    echo "Error: git is not installed. Please install git to use the gcm function."
    return 1
fi

if ! command -v llm &> /dev/null; then
    echo "Error: llm CLI tool is not installed. Please install it from https://llm.datasette.io/en/stable/"
    return 1
fi

b. Handling Empty Staged Changes

Issue: If there are no staged changes, the function proceeds to generate a commit message, which is unnecessary and may lead to confusion.

Recommendation: Incorporate a check to determine if there are staged changes before attempting to generate a commit message.

Implementation:

# After dependency checks
if git diff --cached --quiet; then
    echo "No staged changes to commit."
    return 1
fi

c. Generalizing File Exclusions

Issue: Currently, the function excludes specific lock files. As projects evolve, other files might also need exclusion, making the approach less flexible.

Recommendation: Allow users to define a customizable list of exclusion patterns. This can be achieved by setting an environment variable or passing arguments to the function.

Implementation:

# Define a default list of excluded patterns
GCM_EXCLUDED_PATTERNS=( ':!pnpm-lock.yaml' ':!package-lock.json' ':!yarn.lock' )

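# Allow user to extend exclusions via a semicolon-separated environment variable
if [ -n "$GCM_EXCLUDED_PATTERNS_USER" ]; then
    if [ -n "$ZSH_VERSION" ]; then
        # zsh reads words into an array with -A
        IFS=';' read -r -A USER_EXCLUSIONS <<< "$GCM_EXCLUDED_PATTERNS_USER"
    else
        # bash reads words into an array with -a
        IFS=';' read -r -a USER_EXCLUSIONS <<< "$GCM_EXCLUDED_PATTERNS_USER"
    fi
    GCM_EXCLUDED_PATTERNS+=( "${USER_EXCLUSIONS[@]}" )
fi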

# Use the exclusion patterns in git diff
git diff --cached -- . "${GCM_EXCLUDED_PATTERNS[@]}" | llm ...

Users can set GCM_EXCLUDED_PATTERNS_USER=':!*.md;:!*.txt' in their shell configuration to add more exclusions. Patterns are separated by semicolons, matching the IFS=';' split above.
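To make the semicolon-separated format concrete, here is a minimal sketch of the split on its own. gcm_split_patterns is a hypothetical helper name, not part of the original function. It relies on unquoted word splitting, so it is written for bash or a POSIX sh; zsh does not split unquoted expansions by default.

```shell
# Hypothetical helper: print one pathspec per line from a ';'-separated list.
gcm_split_patterns() {
    # The subshell keeps the IFS and globbing changes away from the caller.
    (
        IFS=';'
        set -f    # keep '*.md' literal instead of expanding it as a glob
        for pattern in $1; do
            # Skip empty entries such as those produced by a trailing ';'
            [ -n "$pattern" ] && printf '%s\n' "$pattern"
        done
        exit 0
    )
}
```

Calling gcm_split_patterns ':!*.md;:!*.txt' prints the two pathspecs :!*.md and :!*.txt on separate lines.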

d. Enhanced Error Handling for LLM Integration

Issue: The function does not handle failures of the llm command, such as network issues or API errors. When that happens the user is offered an empty or missing commit message with no explanation.

Recommendation: Monitor the exit status of the llm command and handle errors gracefully, providing informative messages to the user.

Implementation:

commit_message=$(generate_commit_message)
if [ $? -ne 0 ] || [ -z "$commit_message" ]; then
    echo "Error: Failed to generate commit message using LLM."
    return 1
fi

e. Expanding Commit Message Prefixes

Issue: The current prefixes (feat:, fix:, doc:, chore:) are limited. Conventional Commits allow for a broader range of prefixes, which can provide more granular context.

Recommendation: Update the system prompt to include additional prefixes such as style:, refactor:, perf:, test:, ci:, build:, and revert:. This enables the LLM to categorize commits more accurately.

Implementation:

--system "
Below is a diff of all staged changes, coming from the command:

\`\`\`
git diff --cached
\`\`\`

Please generate a concise, one-line commit message for these changes.
Prefix the commit message appropriately based on the content of the diffs with one of the following:

- **feat:** for new features
- **fix:** for bug fixes
- **docs:** for documentation changes
- **chore:** for maintenance tasks that do not modify src or test files
- **style:** for code style changes (e.g., formatting, missing semi-colons)
- **refactor:** for code changes that neither fix a bug nor add a feature
- **perf:** for performance improvements
- **test:** for adding or modifying tests
- **ci:** for changes to CI configurations
- **build:** for changes to the build system or external dependencies
- **revert:** to revert previous commits

Reply only with the one-line, short commit message, no reasoning, no additional text.
Your reply must be suitable to be used directly as a commit message in git."

f. Properly Escaping Commit Messages

Issue: Commit messages containing special characters (e.g., quotes, $, or backticks) can break the git commit -m command if the message is expanded unquoted or pasted into a command string.

Recommendation: Always pass the message as a double-quoted variable. The shell then hands it to git as a single argument, exactly as written, and no further escaping is needed.

Implementation:

# The double quotes keep the message intact, whatever characters it contains
if git commit -m "$commit_message"; then
    ...
fi

Avoid running the message through printf '%q' first: that format produces text meant to be re-read by the shell (e.g., with eval), so the backslashes it adds would end up in the stored commit message.
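A quick way to see what git actually receives is to substitute a stand-in command that prints its arguments. The sketch below uses a hypothetical show_args helper for that purpose; the sample message is made up for illustration.

```shell
# Hypothetical stand-in for git: print each received argument in brackets.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# A message with characters that are special to the shell
msg='fix: handle "quoted" names, $HOME and `backticks`'

# The double-quoted expansion arrives as one argument, unchanged
show_args -m "$msg"
```

The second line of output contains the message exactly as assigned, which suggests that double-quoting the variable is sufficient on its own.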

g. Improving the read_input Function

Issue: The read_input function handles only Bash and Zsh shells. It may not work as expected in other shell environments, potentially limiting portability.

Recommendation: While supporting all possible shells is complex, adding a fallback mechanism or enhancing compatibility can improve robustness.

Implementation:

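read_input () {
    prompt="$1"
    if [ -n "$ZSH_VERSION" ]; then
        # zsh takes the prompt as "name?prompt"; its -p flag means something else
        read -r "REPLY?$prompt"
    elif [ -n "$BASH_VERSION" ]; then
        read -r -p "$prompt" REPLY
    else
        # Fallback for other shells
        printf '%s' "$prompt"
        IFS= read -r REPLY
    fi
}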

3. Enhancements to the LLM Prompt

a. Clarity and Precision

Issue: While the current prompt is clear, specifying more detailed instructions can help the LLM generate even more accurate commit messages.

Recommendation: Refine the prompt to provide explicit guidance on formatting, length, and context. Include examples if necessary.

Implementation:

--system "
You are a helpful assistant that generates concise, one-line Git commit messages based on the provided diffs.

Below is a diff of all staged changes, derived from the command:

\`\`\`
git diff --cached
\`\`\`

Please generate a clear, concise, and descriptive one-line commit message that appropriately categorizes the changes using one of the following prefixes:

- **feat:** new feature for the user
- **fix:** bug fix
- **docs:** documentation changes
- **style:** changes that do not affect the meaning of the code (white-space, formatting, missing semi-colons)
- **refactor:** code changes that neither fix a bug nor add a feature
- **perf:** code changes that improve performance
- **test:** adding missing tests or correcting existing tests
- **ci:** changes to CI configuration files and scripts
- **build:** changes that affect the build system or external dependencies
- **chore:** other changes that don't modify src or test files
- **revert:** reverts a previous commit

Ensure the commit message:
- Starts with the appropriate prefix.
- Is in the imperative mood (e.g., 'Add feature' not 'Added feature' or 'Adding feature').
- Does not exceed 72 characters.

Reply only with the one-line commit message, without any additional text, explanations, or line breaks.
"

b. Supporting Scoped Prefixes

Issue: A bare prefix does not say which part of the project a commit touches. Conventional Commits allows an optional scope in parentheses for this (e.g., feat(ui): for a UI feature).

Recommendation: Allow the LLM to use scoped prefixes to provide additional context.

Implementation: Incorporate scoped prefixes into the prompt:

... 
Use scoped prefixes to provide additional context when necessary, such as:
- **feat(ui):** adding a new user interface feature
- **fix(api):** fixing an issue in the API
...
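The formatting rules given to the LLM (a known prefix, an optional scope, a single line of at most 72 characters) could also be checked locally before the message is offered to the user. A minimal sketch follows; gcm_validate_message is a hypothetical helper, not part of the original function, and the characters allowed in a scope are an assumption.

```shell
# Hypothetical check: succeed only if the message is one line, at most
# 72 characters, and starts with "type: " or "type(scope): ".
gcm_validate_message() {
    # Reject anything containing a newline
    case "$1" in
        *"
"*) return 1 ;;
    esac
    # Enforce the length limit from the prompt
    [ "${#1}" -le 72 ] || return 1
    # Known type, optional (scope), colon, space, non-empty description
    printf '%s\n' "$1" | grep -Eq \
        '^(feat|fix|docs|style|refactor|perf|test|ci|build|chore|revert)(\([a-z0-9_-]+\))?: .+'
}
```

A caller could, for instance, trigger a regeneration when gcm_validate_message "$commit_message" returns non-zero instead of showing the message.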

4. User Experience Enhancements

a. Visual Improvements with Colors and Formatting

Issue: Plain text prompts and messages can be hard to parse quickly, especially in multi-step interactions.

Recommendation: Incorporate ANSI color codes to highlight different parts of the interface, enhancing readability and user experience.

Implementation:

# Define color variables
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
CYAN='\033[0;36m'
NC='\033[0m' # No Color

# Usage examples
echo -e "${CYAN}Generating AI-powered commit message...${NC}"
echo -e "\n${YELLOW}Proposed commit message:${NC}"
echo -e "${GREEN}$commit_message${NC}"

b. Informative Messages and Feedback

Issue: Users may encounter states (e.g., no staged changes, LLM failures) without clear explanations.

Recommendation: Provide detailed and user-friendly messages that guide the user on next steps when issues arise.

Implementation: As shown in previous sections, add descriptive error messages and confirmations, ensuring users understand the application's state.

c. Configurability and Customization

Issue: A one-size-fits-all approach may not cater to all users' preferences or project requirements.

Recommendation: Allow users to customize various aspects of the function, such as excluded files, LLM models, and commit message conventions.

Implementation:

  • Environment Variables: Enable users to set preferences via environment variables.
    export GCM_LLM_MODEL="claude-3-haiku"
    export GCM_EXCLUDED_PATTERNS_USER=':!*.md;:!*.txt'
  • Function Parameters: Modify the gcm function to accept optional flags or parameters for customization.
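As a rough illustration of the function-parameter route, the sketch below parses two options with the standard getopts builtin. The helper name gcm_parse_args and the flags -m (model) and -x (extra exclusion) are hypothetical choices, not part of the original function.

```shell
# Hypothetical option parser: -m sets the model, -x appends an exclusion.
gcm_parse_args() {
    OPTIND=1    # reset so the function can be called more than once
    while getopts "m:x:" opt; do
        case "$opt" in
            m) GCM_LLM_MODEL="$OPTARG" ;;
            x) # Append to the ';'-separated list, adding a separator if needed
               GCM_EXCLUDED_PATTERNS_USER="${GCM_EXCLUDED_PATTERNS_USER:+$GCM_EXCLUDED_PATTERNS_USER;}$OPTARG" ;;
            *) echo "Usage: gcm [-m model] [-x pathspec]" >&2
               return 2 ;;
        esac
    done
}
```

Inside gcm, a call such as gcm_parse_args "$@" || return would let a single invocation override the model or add exclusions without touching the shell configuration.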

d. Enhanced Interaction Options

Issue: Limited interaction options might not accommodate all user needs, such as skipping AI generation.

Recommendation: Expand the options presented to the user, such as bypassing AI and entering a commit message manually, or integrating with Git's default commit message editor.

Implementation:

# In the options case statement
case "$choice" in
    a|A )
        # Existing accept logic
        ;;
    e|E )
        # Existing edit logic
        ;;
    r|R )
        # Existing regenerate logic
        ;;
    c|C )
        # Existing cancel logic
        ;;
    m|M )
        # New option to open the default editor
        git commit
        return 0
        ;;
    s|S )
        # New option to skip AI and enter message manually
        read_input "Enter your commit message: "
        commit_message=$REPLY
        if [ -n "$commit_message" ] && git commit -m "$commit_message"; then
            echo -e "${GREEN}Changes committed successfully with your message!${NC}"
            return 0
        else
            echo -e "${RED}Commit failed. Please check your message and try again.${NC}"
            return 1
        fi
        ;;
    * )
        # Existing invalid choice logic
        ;;
esac

Update the prompt accordingly to include new options (m for editor, s for skip AI).

e. Support for Multiple Shells

Issue: The current read_input function is tailored for Bash and Zsh, limiting its portability.

Recommendation: While full compatibility with all shells is challenging, improving the fallback mechanism can enhance usability across diverse environments.

Implementation: As detailed in the read_input improvement, provide a general fallback when specific shell versions are not detected.

f. Feedback on Commit Status

Issue: Users receive minimal feedback post-commit, especially on failures.

Recommendation: Enhance feedback mechanisms to include more details on commit status, aiding in troubleshooting.

Implementation:

if git commit -m "$commit_message"; then
    echo -e "${GREEN}✔ Changes committed successfully!${NC}"
else
    echo -e "${RED}✖ Commit failed. Please check your changes and try again.${NC}"
    return 1
fi

5. Additional Features and Considerations

a. Multi-line Commit Messages

Issue: Some commits may require more detailed descriptions beyond a single line.

Recommendation: Allow the generation and inclusion of multi-line commit messages, optionally.

Implementation: Modify the prompt to allow multi-line messages and adjust the commit command to handle them.

# Adjust the LLM prompt to allow a short description followed by an optional body
--system "
...
Please generate a concise commit message following the Conventional Commits specification, which may include a short summary and an optional detailed description separated by a blank line.
...
"

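# When accepting the commit message: the quoted variable keeps embedded
# newlines, so this form also works for multi-line messages
if git commit -m "$commit_message"; then
    ...
fi
# Alternatively, pass the summary and the body as separate paragraphs: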
git commit -m "$(echo "$commit_message" | head -n1)" -m "$(echo "$commit_message" | tail -n +2)"
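If the summary and body need to be handled separately, for example to show only the summary in the prompt, the split can be isolated in two small helpers. This is a sketch under the assumption that the LLM returns a summary line, a blank line, and then the body; gcm_subject and gcm_body are hypothetical names.

```shell
# Hypothetical helpers for a "summary, blank line, body" message layout.

# First line of the message
gcm_subject() {
    printf '%s\n' "$1" | sed -n '1p'
}

# Everything after the first line, with leading blank lines removed
gcm_body() {
    printf '%s\n' "$1" | sed '1d' | sed '/./,$!d'
}

# Possible use when committing (not executed here):
#   body=$(gcm_body "$commit_message")
#   if [ -n "$body" ]; then
#       git commit -m "$(gcm_subject "$commit_message")" -m "$body"
#   else
#       git commit -m "$commit_message"
#   fi
```

For a one-line message gcm_body prints nothing, so the caller can fall back to the plain single -m form.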

b. Logging and Debugging

Issue: Troubleshooting failures in AI generation or commit operations can be challenging without detailed logs.

Recommendation: Implement logging mechanisms to record actions, responses from the LLM, and errors.

Implementation:

# Define a log file
GCM_LOG_FILE="$HOME/.gcm.log"

# Log actions and errors
echo "$(date) - Generating commit message..." >> "$GCM_LOG_FILE"
commit_message=$(generate_commit_message 2>> "$GCM_LOG_FILE")
if [ $? -ne 0 ] || [ -z "$commit_message" ]; then
    echo "Error: Failed to generate commit message using LLM." | tee -a "$GCM_LOG_FILE"
    return 1
fi

c. Rate Limiting and API Usage Tracking

Issue: Frequent use of the LLM may lead to exceeding API rate limits or incurring unexpected costs.

Recommendation: Implement rate limiting within the function and track usage to help users manage their AI interactions responsibly.

Implementation: This can be complex and may require external tools or services to track and limit usage. Alternatively, inform users about potential costs and recommend monitoring their usage separately.
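A very simple local throttle is possible without external services, at the cost of only guarding against rapid repeated calls rather than tracking real API usage. The sketch below stores the time of the last call in a file; gcm_rate_limit_ok, GCM_STAMP_FILE, and GCM_MIN_INTERVAL are hypothetical names, and date +%s is assumed to be available (it is on GNU and BSD systems).

```shell
# Hypothetical throttle: succeed at most once every GCM_MIN_INTERVAL seconds.
gcm_rate_limit_ok() {
    stamp_file="${GCM_STAMP_FILE:-$HOME/.gcm.last_call}"
    min_interval="${GCM_MIN_INTERVAL:-5}"
    now=$(date +%s)
    last=0
    [ -r "$stamp_file" ] && last=$(cat "$stamp_file")
    # Treat an empty or corrupted stamp as "never called"
    case "$last" in
        ''|*[!0-9]*) last=0 ;;
    esac
    if [ $((now - last)) -lt "$min_interval" ]; then
        return 1    # too soon since the previous call
    fi
    # Record this call for the next check
    printf '%s\n' "$now" > "$stamp_file"
}
```

The function could be checked right before the LLM is invoked, for instance gcm_rate_limit_ok || { echo "Please wait before regenerating."; return 1; }.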


6. Revised Function Incorporating Recommendations

Below is a refined version of the gcm function that integrates several of the aforementioned optimizations and improvements for enhanced robustness and user experience.

# -----------------------------------------------------------------------------
# Enhanced AI-powered Git Commit Function with Optimizations and UX Improvements
# Add this to your ~/.bashrc or ~/.zshrc to utilize the `gcm` command.
# Features:
# 1) Validates required dependencies.
# 2) Checks for staged changes.
# 3) Allows customizable exclusion patterns.
# 4) Generates commit messages using an LLM with an extended prompt.
# 5) Provides user-friendly prompts with color-coded messages.
# 6) Handles errors gracefully with informative feedback.
# -----------------------------------------------------------------------------

gcm () {
    # Define color codes for enhanced readability
    RED='\033[0;31m'
    GREEN='\033[0;32m'
    YELLOW='\033[1;33m'
    CYAN='\033[0;36m'
    NC='\033[0m' # No Color

    # Check for required commands
    if ! command -v git &> /dev/null; then
        echo -e "${RED}Error: git is not installed. Please install git to use the gcm function.${NC}"
        return 1
    fi

    if ! command -v llm &> /dev/null; then
        echo -e "${RED}Error: llm CLI tool is not installed. Please install it from https://llm.datasette.io/en/stable/${NC}"
        return 1
    fi

    # Check for staged changes
    if git diff --cached --quiet; then
        echo -e "${YELLOW}No staged changes to commit.${NC}"
        return 1
    fi

    # Define exclusion patterns
    GCM_EXCLUDED_PATTERNS=( ':!pnpm-lock.yaml' ':!package-lock.json' ':!yarn.lock' )
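    if [ -n "$GCM_EXCLUDED_PATTERNS_USER" ]; then
        if [ -n "$ZSH_VERSION" ]; then
            # zsh reads words into an array with -A
            IFS=';' read -r -A USER_EXCLUSIONS <<< "$GCM_EXCLUDED_PATTERNS_USER"
        else
            # bash reads words into an array with -a
            IFS=';' read -r -a USER_EXCLUSIONS <<< "$GCM_EXCLUDED_PATTERNS_USER"
        fi
        GCM_EXCLUDED_PATTERNS+=( "${USER_EXCLUSIONS[@]}" )
    fi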

    # Function to generate commit message
    generate_commit_message() {
        git diff --cached -- . "${GCM_EXCLUDED_PATTERNS[@]}" | llm --no-log --model "${GCM_LLM_MODEL:-claude-3-haiku}" --system "
You are a helpful assistant that generates concise, one-line Git commit messages based on the provided diffs.

Below is a diff of all staged changes, derived from the command:

\`\`\`
git diff --cached
\`\`\`

Please generate a clear, concise, and descriptive one-line commit message that appropriately categorizes the changes using one of the following prefixes:

- **feat:** new feature for the user
- **fix:** bug fix
- **docs:** documentation changes
- **style:** code style changes (e.g., formatting, missing semi-colons)
- **refactor:** code changes that neither fix a bug nor add a feature
- **perf:** performance improvements
- **test:** adding or modifying tests
- **ci:** changes to CI configuration files and scripts
- **build:** changes that affect the build system or external dependencies
- **chore:** maintenance tasks that do not modify src or test files
- **revert:** to revert previous commits

Ensure the commit message:
- Starts with the appropriate prefix.
- Is in the imperative mood (e.g., 'Add feature' not 'Added feature' or 'Adding feature').
- Does not exceed 72 characters.

Reply only with the one-line commit message, without any additional text, explanations, or line breaks."
    }

    # Function to read user input compatibly with both Bash and Zsh
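    read_input () {
        prompt="$1"
        if [ -n "$ZSH_VERSION" ]; then
            # zsh takes the prompt as "name?prompt"; its -p flag means something else
            read -r "REPLY?$prompt"
        elif [ -n "$BASH_VERSION" ]; then
            read -r -p "$prompt" REPLY
        else
            # Fallback for other shells
            printf '%s' "$prompt"
            IFS= read -r REPLY
        fi
    }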

    # Log file
    GCM_LOG_FILE="${GCM_LOG_FILE:-"$HOME/.gcm.log"}"

    # Main script
    echo -e "${CYAN}Generating AI-powered commit message...${NC}"
    echo "$(date) - Generating commit message..." >> "$GCM_LOG_FILE"
    commit_message=$(generate_commit_message 2>> "$GCM_LOG_FILE")

    # Check if commit_message was successfully generated
    if [ $? -ne 0 ] || [ -z "$commit_message" ]; then
        echo -e "${RED}Error: Failed to generate commit message using LLM.${NC}"
        echo "$(date) - Failed to generate commit message." >> "$GCM_LOG_FILE"
        return 1
    fi

    while true; do
        echo -e "\n${YELLOW}Proposed commit message:${NC}"
        # printf keeps backslashes in the message literal (echo -e would interpret them)
        printf '%b%s%b\n' "$GREEN" "$commit_message" "$NC"

        read_input "Do you want to (a)ccept, (e)dit, (r)egenerate, (m)anual, or (c)ancel? [a/e/r/m/c]: "
        choice=$REPLY

        case "$choice" in
            a|A )
                # The double quotes pass the message to git unmodified
                if git commit -m "$commit_message"; then
                    echo -e "${GREEN}✔ Changes committed successfully!${NC}"
                    echo "$(date) - Committed with message: $commit_message" >> "$GCM_LOG_FILE"
                    return 0
                else
                    echo -e "${RED}✖ Commit failed. Please check your changes and try again.${NC}"
                    echo "$(date) - Commit failed." >> "$GCM_LOG_FILE"
                    return 1
                fi
                ;;
            e|E )
                read_input "Enter your commit message: "
                commit_message=$REPLY
                if [ -n "$commit_message" ]; then
                    if git commit -m "$commit_message"; then
                        echo -e "${GREEN}✔ Changes committed successfully with your message!${NC}"
                        echo "$(date) - Committed with user-provided message: $commit_message" >> "$GCM_LOG_FILE"
                        return 0
                    else
                        echo -e "${RED}✖ Commit failed. Please check your message and try again.${NC}"
                        echo "$(date) - Commit failed with user message." >> "$GCM_LOG_FILE"
                        return 1
                    fi
                else
                    echo -e "${RED}✖ Empty commit message. Please try again.${NC}"
                fi
                ;;
            r|R )
                echo -e "${CYAN}Regenerating commit message...${NC}"
                echo "$(date) - Regenerating commit message..." >> "$GCM_LOG_FILE"
                commit_message=$(generate_commit_message 2>> "$GCM_LOG_FILE")
                if [ $? -ne 0 ] || [ -z "$commit_message" ]; then
                    echo -e "${RED}Error: Failed to regenerate commit message using LLM.${NC}"
                    echo "$(date) - Failed to regenerate commit message." >> "$GCM_LOG_FILE"
                    return 1
                fi
                ;;
            m|M )
                echo -e "${CYAN}Opening your default editor to compose a commit message...${NC}"
                git commit
                return $?
                ;;
            c|C )
                echo -e "${YELLOW}Commit cancelled.${NC}"
                echo "$(date) - Commit cancelled by user." >> "$GCM_LOG_FILE"
                return 1
                ;;
            * )
                echo -e "${RED}Invalid choice. Please try again.${NC}"
                ;;
        esac
    done
}

Key Enhancements in the Revised Function:

  1. Dependency and State Checks: Verifies the presence of git and llm, and ensures there are staged changes before proceeding.
  2. Customizable Exclusions: Allows users to extend exclusion patterns via the GCM_EXCLUDED_PATTERNS_USER environment variable.
  3. Extended Commit Prefixes: Incorporates a broader range of prefixes following the Conventional Commits specification.
  4. Robust Error Handling: Captures and logs errors from the LLM and Git commands, providing clear feedback to the user.
  5. Enhanced User Prompts: Utilizes color-coded messages for improved readability and user guidance.
  6. Additional Interaction Options: Introduces manual commit message entry and the ability to use Git's default editor.
  7. Logging Mechanism: Records actions and outcomes to a log file (~/.gcm.log) for auditing and troubleshooting purposes.
  8. Secure Commit Message Handling: Passes the commit message to git as a double-quoted argument so that special characters do not break the command.

7. Conclusion

The gcm function is a powerful tool that, when optimized and enhanced, can significantly streamline the Git commit process by integrating AI-generated messages. The recommended improvements focus on enhancing robustness, flexibility, and user experience, ensuring that the tool is both reliable and adaptable to various user preferences and project requirements. By implementing these suggestions, users can achieve a more efficient and error-resistant workflow, ultimately fostering better project documentation and collaboration practices.
