While maintaining a long-lived production repository, I noticed an unusually large pack file under .git/objects/pack consuming hundreds of MB.
This was affecting cloning speed, CI performance, and overall repository health.
- Checked the .git folder size: `du -sh .git`
- Verified the large pack files inside `.git/objects/pack`: `ls -lh .git/objects/pack`
- Analyzed the largest Git objects: `git verify-pack -v .git/objects/pack/pack-xxxx.pack | sort -k3 -n | tail -20`
- Mapped object hashes to actual file paths: `git rev-list --objects --all | grep <hash>`
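These four checks are easy to string together in a few lines of Python. Below is a minimal sketch (not the post's own script); the `pack-*.idx` glob, the 10 MB threshold, and the function names are illustrative assumptions:

```python
import glob
import os
import subprocess

THRESHOLD_BYTES = 10 * 1024 * 1024  # assumed cutoff: flag anything bigger than ~10 MB


def run_git(*args, repo="."):
    """Run a git command inside `repo` and return its stdout."""
    return subprocess.run(
        ["git", "-C", repo, *args], capture_output=True, text=True, check=True
    ).stdout


def largest_objects(repo="."):
    """Return [(size_bytes, sha), ...] for objects above the threshold, largest first."""
    found = []
    pack_dir = os.path.join(os.path.abspath(repo), ".git", "objects", "pack")
    for idx in glob.glob(os.path.join(pack_dir, "pack-*.idx")):
        # Same data as `git verify-pack -v ... | sort -k3 -n | tail -20`
        for line in run_git("verify-pack", "-v", idx, repo=repo).splitlines():
            parts = line.split()
            # Object lines: <sha> <type> <size> <size-in-pack> <offset> [...]
            if len(parts) >= 5 and parts[1] in {"blob", "tree", "commit", "tag"}:
                size = int(parts[2])
                if size > THRESHOLD_BYTES:
                    found.append((size, parts[0]))
    return sorted(found, reverse=True)


def sha_to_path(repo="."):
    """Map object SHAs to the paths they were committed under (`rev-list --objects`)."""
    mapping = {}
    for line in run_git("rev-list", "--objects", "--all", repo=repo).splitlines():
        sha, _, path = line.partition(" ")
        if path:
            mapping[sha] = path
    return mapping


if __name__ == "__main__":
    paths = sha_to_path()
    for size, sha in largest_objects():
        print(f"{size / 2**20:8.1f} MB  {sha}  {paths.get(sha, '(no path)')}")
```

Running it prints the same information as the `verify-pack` / `rev-list` pipeline above, already sorted largest first.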
 
A Python automation script was created to:
- Identify large Git objects (>10 MB)
- Map them to their actual file paths
- Confirm user intent
- Execute the `git filter-repo` cleanup
- Repack the repository using `git gc --aggressive`
- Measure before/after repo size
 
`python analyze-pack.py --cleanup`

When the repo had unstaged changes or wasn't a fresh clone, the script prompted for a `--force` cleanup only when it was safe to proceed.
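The post does not include analyze-pack.py itself, so here is a hedged sketch of the cleanup flow it describes: check the working tree, confirm intent, rewrite history with `git filter-repo`, repack, and report the size delta. The function names, prompt wording, and `--force` handling are assumptions, and it presumes git-filter-repo is installed:

```python
import os
import subprocess


def run_git(*args, repo="."):
    """Run a git command inside `repo` and return its stdout."""
    return subprocess.run(
        ["git", "-C", repo, *args], capture_output=True, text=True, check=True
    ).stdout


def repo_size_mb(repo="."):
    """Size of the .git directory in MB, like `du -sh .git`."""
    total = 0
    for root, _, files in os.walk(os.path.join(repo, ".git")):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total / 2**20


def cleanup(paths, repo=".", force=False):
    """Strip `paths` from history, repack, and report the before/after size."""
    # git filter-repo refuses to rewrite a repo with local changes or one that
    # is not a fresh clone unless --force is passed, so surface that up front.
    if run_git("status", "--porcelain", repo=repo).strip() and not force:
        raise SystemExit("Working tree not clean; re-run with force=True if intended.")

    before = repo_size_mb(repo)
    if input(f"Rewrite history to drop {len(paths)} path(s)? [y/N] ").lower() != "y":
        return

    args = ["filter-repo", "--invert-paths"]
    for path in paths:
        args += ["--path", path]
    if force:
        args.append("--force")
    run_git(*args, repo=repo)

    # Repack aggressively so the dropped objects are actually pruned from disk.
    run_git("gc", "--prune=now", "--aggressive", repo=repo)

    after = repo_size_mb(repo)
    print(f"Before: {before:.1f} MB, after: {after:.1f} MB "
          f"({100 * (before - after) / before:.0f}% smaller)")
```

Measuring the `.git` directory directly mirrors the `du -sh .git` check from the investigation step.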
The equivalent manual commands:

```bash
git filter-repo --invert-paths --path <large-file-path>
git gc --prune=now --aggressive
git remote add origin <your-remote-url>
git push origin main --force
```

The results:
- Initial size: 769.58 MB
- Final size: ~220 MB
- Reduction: 71% smaller! 🎯
- History cleaned of large assets like `.zip`, `.sql`, `.psd`, and `.rar` files.
- Always inspect `.git/objects/pack` when a repo grows unexpectedly.
- Use `git filter-repo` (not BFG) for safe & modern cleanup.
- Never forget: after `git filter-repo`, your remotes are removed; re-add them manually!
- Automating cleanup ensures consistent reproducibility in DevOps pipelines.
 
Regular repository hygiene saves:
- Developer time (faster clone & fetch)
- CI/CD cost (smaller artifacts)
- Future headaches 🚀
 
Tools: `git filter-repo`, `du`, `grep`, `awk`, `pyenv`, plus a Python automation script.
#GitOptimization #SoftwareEngineering #DevOps #CleanCode #PythonAutomation