@kevinsimper
Created June 4, 2025 09:02
Estimate how many tokens your codebase contains
#!/bin/bash
# Print statistics for the TypeScript files in the src directory
echo "=== TypeScript Files Statistics ==="
echo
# Count files
file_count=$(find src -type f \( -name "*.ts" -o -name "*.tsx" \) | wc -l)
echo "Total files (.ts and .tsx): $file_count"
# Count words (concatenate first: one total even with a single file, many xargs batches, or spaces in names)
word_count=$(find src -type f \( -name "*.ts" -o -name "*.tsx" \) -print0 | xargs -0 cat | wc -w)
echo "Total words: $word_count"
# Estimate tokens (rough approximation: 1 word ≈ 1.3 to 1.5 tokens for code)
tokens_low=$(echo "$word_count * 1.3" | bc | cut -d. -f1)
tokens_high=$(echo "$word_count * 1.5" | bc | cut -d. -f1)
echo "Estimated tokens: $tokens_low - $tokens_high"
echo
echo "=== Breakdown by directory ==="
echo
# Show breakdown by major directories
for dir in src/lib src/model src/routes src/services src/views src/workflows; do
  if [ -d "$dir" ]; then
    dir_files=$(find "$dir" -type f \( -name "*.ts" -o -name "*.tsx" \) | wc -l)
    # Concatenate before counting so the result is a single number, not a per-file listing
    dir_words=$(find "$dir" -type f \( -name "*.ts" -o -name "*.tsx" \) -print0 | xargs -0 cat | wc -w)
    dir_tokens_low=$(echo "$dir_words * 1.3" | bc | cut -d. -f1)
    dir_tokens_high=$(echo "$dir_words * 1.5" | bc | cut -d. -f1)
    printf "%-20s %3d files, %6d words, %6d - %6d tokens\n" "$dir:" "$dir_files" "$dir_words" "$dir_tokens_low" "$dir_tokens_high"
  fi
done
echo
echo "=== Top 10 largest files by word count ==="
echo
# Show top 10 largest files (filter out wc's "total" lines instead of assuming there is exactly one)
find src -type f \( -name "*.ts" -o -name "*.tsx" \) -exec wc -w {} + | grep -v ' total$' | sort -nr | head -10
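
The word-based range above is a rough heuristic, so a second estimate from a different signal can serve as a sanity check. The sketch below divides the total byte count by 4, following the common rule of thumb of roughly 4 characters per token for English text. That ratio is an assumption here, not something the original script uses, and source code may tokenize differently. `bytes_to_tokens` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): convert a byte count
# to a token estimate, assuming ~4 bytes per token, rounded up.
bytes_to_tokens() {
  echo $(( ($1 + 3) / 4 ))
}

# Total bytes across .ts/.tsx files (bytes ≈ characters for mostly-ASCII source).
# Yields 0 if src is missing or contains no matching files.
byte_count=$(find src -type f \( -name "*.ts" -o -name "*.tsx" \) -print0 2>/dev/null | xargs -0 cat | wc -c)

echo "Estimated tokens (bytes / 4): $(bytes_to_tokens "$byte_count")"
```

If this figure lands far outside the word-based range, neither number should be trusted much; running the model's actual tokenizer over the files is the only way to get an exact count.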