This script generates a comprehensive project context dump, including your project's structure, file contents, and optionally Git history, enabling detailed analysis by AI chatbot models such as ChatGPT, Bard, Bing Chat, and Claude. The generated context file (DUMP_FROM_AICD_<datetime>.txt) is timestamped to ensure uniqueness.
NOTE: Compatible with Mac and Unix-style command-line environments. Requires standard Unix utilities and, for some features, jq and git. Windows PowerShell support is not currently available.
- Directory Tree: Outputs your project's directory structure using tree if available (attempts to respect AICD_ignore by name), falling back to find.
- File Contents: Dumps contents of text-based files (source code, config files, etc.).
- Intelligent Truncation & Handling:
  - CSV Files (.csv): Only the first MAX_CSV_LINES (default 10) lines are included, and each of those lines is truncated to MAX_CSV_LINE_CHARS (default 1000) characters.
  - JSON Files (.json): Only the first MAX_JSON_CHARS (default 2000) characters are included.
  - Jupyter Notebooks (.ipynb):
    - Uses jq (if installed, highly recommended) to reliably process the notebook's JSON structure.
    - Replaces embedded image data (e.g., image/png, image/jpeg) with the placeholder <image_data_removed>.
    - Truncates long text outputs (text/plain, text/html, stream) within notebook cells to MAX_IPYNB_OUTPUT_CHARS (default 1000) characters.
    - Falls back to sed for basic image removal if jq is not found (less reliable, memory-intensive for large notebooks).
  - Log Files (.log): Large log files are truncated to show only the first MAX_LOG_HEAD_LINES (default 50) and the last MAX_LOG_TAIL_LINES (default 50) lines.
  - Large Text Files: Generic text files larger than 2 MB are truncated, showing the head and tail.
- Git Information (Optional): If run within a Git repository and git is installed, includes sections for:
  - Recent Commit Log (git log ...)
  - Contributors (git shortlog -sn)
  - Tags/Releases (git tag)
  - Local Branches (git branch)
  - Remote URLs (git remote -v)
- Exclusions:
  - Common hidden/build directories and files (.git, .venv, node_modules, __pycache__, .DS_Store, *.pyc, etc.) are skipped.
  - The root .gitignore file is included by default, but .gitignore files in subdirectories are skipped.
  - Generated dumps (DUMP_FROM_AICD_*.txt) and other AICD_* files are automatically excluded.
  - Any files, directories, or patterns (using wildcards like *.tmp) listed in AICD_ignore are skipped.
- Progress Output: Prints the current dump size and the file being processed (or skipped) to the console during execution.
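The truncation rules listed above can be approximated with standard Unix tools. The following is only a minimal sketch: the variable names and defaults come from this README, but the function dump_truncated is hypothetical and is not taken from AICD_dump.sh, whose actual logic may differ.

```shell
# Hypothetical re-creation of the truncation rules described in this README.
# Variable names and defaults follow the README; the function is illustrative.
MAX_CSV_LINES=10
MAX_CSV_LINE_CHARS=1000
MAX_JSON_CHARS=2000
MAX_LOG_HEAD_LINES=50
MAX_LOG_TAIL_LINES=50

dump_truncated() {
  file=$1
  case "$file" in
    *.csv)
      # Keep the first N lines, then clip each kept line to M characters.
      head -n "$MAX_CSV_LINES" "$file" | cut -c "1-$MAX_CSV_LINE_CHARS"
      ;;
    *.json)
      # Keep only the first N bytes (bytes stand in for characters here).
      head -c "$MAX_JSON_CHARS" "$file"
      ;;
    *.log)
      # tr strips the padding some wc implementations add to the count.
      total=$(wc -l < "$file" | tr -d ' ')
      if [ "$total" -le $((MAX_LOG_HEAD_LINES + MAX_LOG_TAIL_LINES)) ]; then
        cat "$file"
      else
        head -n "$MAX_LOG_HEAD_LINES" "$file"
        echo "... [truncated] ..."
        tail -n "$MAX_LOG_TAIL_LINES" "$file"
      fi
      ;;
    *)
      cat "$file"
      ;;
  esac
}
```

For example, dump_truncated data/sample.csv would print at most 10 lines of at most 1000 characters each. The "... [truncated] ..." marker is an assumption made for this sketch.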
On the command line, navigate to your project's root directory and execute:
curl -sSL https://gist.githubusercontent.com/cversek/497ada79c245cd8a2cdee2360811f014/raw/AICD_install.sh | bash
Behavior notes:
- If AICD_system_message.txt already exists, it will NOT be overwritten, preserving your customizations.
- If AICD_dump.sh exists, it will be renamed to AICD_dump_OLD_<datetime>.sh. You'll receive a warning to manually merge any customizations.
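As a rough illustration of the two behaviors above, the sketch below shows one way an installer could back up an existing script and leave an existing message file alone. Both function names and the timestamp format are assumptions made for this example; they are not taken from AICD_install.sh.

```shell
# Hypothetical sketch of the installer behavior described above.
# The timestamp format (YYYYmmdd_HHMMSS) is an assumption.
backup_existing_dump() {
  target=${1:-AICD_dump.sh}
  if [ -f "$target" ]; then
    stamp=$(date +%Y%m%d_%H%M%S)
    backup="${target%.sh}_OLD_${stamp}.sh"
    mv "$target" "$backup"
    echo "WARNING: $target renamed to $backup; merge customizations manually." >&2
  fi
}

# Copy a default file into place only when the destination does not exist yet.
install_if_missing() {
  src=$1
  dest=$2
  if [ -e "$dest" ]; then
    echo "Keeping existing $dest"
  else
    cp "$src" "$dest"
  fi
}
```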
If you haven't customized the old dump script, you can safely remove old backups with:
rm AICD_dump_OLD_*.sh
Edit the provided AICD_system_message.txt file to include:
- Your project's coding preferences and guidelines.
- Instructions or questions for the AI assistant.
- Specific context or clarifications useful for your AI chat sessions.
You can specify files and directories to be ignored by adding them to AICD_ignore.
Each line in AICD_ignore should contain a relative path to a file or directory, or a wildcard pattern such as *.tmp.
# List paths to be ignored by AICD_dump.sh
# Add files/directories below, one per line. Comments start with #.
# Paths are relative to the project root where AICD_dump.sh is run.
AICD_* # Exclude all AICD script/config/dump/backup files
DUMP_FROM_AICD_*.txt # Important, must exclude the current dump otherwise we would be in a recursive loop!
.vscode/ # Ignore VS Code settings folder
node_modules/ # Ignore Node dependencies folder (also ignored by default script logic)
.venv/ # Ignore Python virtual environments (also ignored by default script logic)
__pycache__/ # Ignore Python bytecode cache (also ignored by default script logic)
.git/ # Ignore git metadata (also ignored by default script logic)
# Common Build/Dist/Cache directories
build/
dist/
cache/
*.egg-info/ # Python packaging info
# Common Temporary/OS files
*.pyc
*.pyo
*.swp
*.swo
*~
.DS_Store # macOS specific
# Log files (can be specific paths like logs/ or general like *.log)
*.log
logs/
# Add your project-specific files/directories below:
# e.g., data/large_dataset.bin
# e.g., temp_output/
# e.g., sensitive_config.ini
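To make the semantics of the ignore file concrete, here is a minimal sketch of how entries like those above could be matched against a path. The function is_ignored is hypothetical; the matching logic in AICD_dump.sh may differ, in particular in how it treats nested directories and inline comments.

```shell
# Hypothetical matcher for AICD_ignore-style files (not the script's own code).
# Usage: is_ignored <relative-path> [ignore-file]; exit status 0 means "skip".
is_ignored() {
  path=$1
  ignore_file=${2:-AICD_ignore}
  [ -f "$ignore_file" ] || return 1
  while IFS= read -r line || [ -n "$line" ]; do
    # Drop comments (everything from '#') and surrounding whitespace.
    pattern=$(printf '%s\n' "${line%%#*}" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')
    [ -n "$pattern" ] || continue
    case "$pattern" in
      */)
        # Directory entry: match the directory itself or anything beneath it.
        dir=${pattern%/}
        case "/$path/" in
          */$dir/*) return 0 ;;
        esac
        ;;
      *)
        # File entry: try the full relative path, then the bare file name.
        base=${path##*/}
        case "$path" in
          $pattern) return 0 ;;
        esac
        case "$base" in
          $pattern) return 0 ;;
        esac
        ;;
    esac
  done < "$ignore_file"
  return 1
}
```

With the sample file above, is_ignored logs/app.log and is_ignored node_modules/pkg/index.js would both succeed, while is_ignored src/main.py would not.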
Ensure required tools (such as tree) are installed, then execute:
chmod +x AICD_dump.sh
./AICD_dump.sh
This generates the timestamped context file (DUMP_FROM_AICD_<datetime>.txt).
Start your AI session by uploading (or pasting) DUMP_FROM_AICD_<datetime>.txt:
- If file upload is supported: Attach the file directly.
- If file upload is not supported: Paste the entire contents into the chat.
Suggested initial prompt:
"Please review the provided project context (DUMP_FROM_AICD_<datetime>.txt) and form a complete mental model of my codebase. Provide a concise summary of the project's current state and suggest next steps or improvements."
If you encounter input length or truncation issues:
- ChatGPT: Ask, "It seems part of the dump may have been truncated. Could you specify sections needing more detail?"
- Google Bard: Start with a summarized version, then ask, "I've provided a summary. Let me know if additional details are needed."
- Microsoft Bing Chat: Break the dump into smaller segments and provide clearly labeled parts.
- Claude (Anthropic): Usually handles long texts well, but if necessary, segment your context clearly and submit in parts.
- Other Models: Use a similar strategy: submit in segments if needed, and clarify what information is most important.
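If you need to submit the dump in parts, the standard split utility can cut it on line boundaries. The helper below is a sketch only: split_dump and its default of 2000 lines per part are assumptions for illustration, and a suitable part size depends on the input limit of the model you use.

```shell
# Hypothetical helper: split a dump into line-based parts for pasting.
# Usage: split_dump <dump-file> [lines-per-part]
split_dump() {
  dump=$1
  lines_per_part=${2:-2000}   # assumed default; tune to your model's limit
  prefix="${dump%.txt}_part_"
  # split names the parts <prefix>aa, <prefix>ab, ... in order.
  split -l "$lines_per_part" "$dump" "$prefix"
  # List the generated parts so they can be pasted in sequence.
  ls "$prefix"*
}
```

Note that the parts in this sketch start with DUMP_FROM_AICD_ but do not end in .txt, so they are not covered by the DUMP_FROM_AICD_*.txt exclusion; delete them or add them to AICD_ignore before generating the next dump.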
To include additional file types in the dump:
- Edit AICD_dump.sh.
- Locate the regex pattern (search for EXT_REGEX).
- Add new extensions (e.g., r|rmd|jsx|tsx) separated by the pipe | character.
- Save changes and re-run the script to verify your additions.
Example:
EXT_REGEX='\.(py|txt|md|json|sh|sql|js|jsx|ts|tsx|html|htm|css|sass|scss|java|c|cpp|h|hpp|rb|go|rs|php|pl|r|swift|kt|scala|coffee|yaml|yml|ini|cfg|conf|toml|xml|bat|ps1)$'
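Before re-running the full dump, you can check an edited pattern against a few file names. The sketch below uses a shortened, hypothetical pattern with rmd added, and tests it with grep -E; how AICD_dump.sh itself applies EXT_REGEX is not shown in this README, so treat the helper matches_ext as an assumption.

```shell
# Shortened, hypothetical pattern with rmd added; the real EXT_REGEX is longer.
EXT_REGEX='\.(py|md|json|sh|r|rmd|jsx|tsx)$'

# Succeeds when the file name ends in one of the listed extensions.
matches_ext() {
  # -E enables extended regex syntax, which the (a|b) alternation needs.
  printf '%s\n' "$1" | grep -Eq "$EXT_REGEX"
}

for f in analysis/report.rmd src/App.tsx photo.png; do
  if matches_ext "$f"; then
    echo "included: $f"
  else
    echo "skipped:  $f"
  fi
done
```

The trailing $ anchors the match to the end of the name, so a file such as notes.mdx is not picked up by the md entry.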
- Git: Ensure git is installed for cloning the gist.
- Tree command (recommended): Install with sudo apt-get install tree on Ubuntu/Debian or brew install tree on Mac.
- Bash 4 or higher: Required for extended globbing functionality.
By following these steps, you can efficiently generate a detailed project context dump and effectively leverage AI assistants to gain valuable insights into your codebase.