@rhcarvalho
Created March 3, 2014 15:57
Finding typos in Django

Usage

  1. Run collectwords.py with file paths as arguments to build a database of words.
  2. Run spellcheck.py to flag the misspelled words in the database.
  3. Use sqlite3 shell or anything else to output misspelled words to a file.
  4. Go through the file eliminating false positives.
  5. Search through the codebase and fix typo by typo :-)
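
Step 3 could be done from Python instead of the sqlite3 shell. The following is a minimal sketch, not part of the original scripts: the function name `export_misspelled` and the output path are hypothetical, and it assumes that a `misspell` value of 1 marks a misspelled word in the `words` table.

```python
import sqlite3


def export_misspelled(db_path, out_path):
    """Write every word flagged as misspelled to out_path, one per line.

    Assumes misspell = 1 means "misspelled"; rows where misspell is
    NULL (not yet checked) or 0 are skipped.
    """
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT word FROM words WHERE misspell = 1 ORDER BY word"
        ).fetchall()
    finally:
        conn.close()
    with open(out_path, "w", encoding="utf-8") as out:
        for (word,) in rows:
            out.write(word + "\n")
    # Return the number of words written so the caller can report it.
    return len(rows)
```

Calling `export_misspelled("words.sqlite", "misspelled.txt")` would produce the file to review in step 4.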
# collectwords.py
import re
import sqlite3
import sys

c = sqlite3.connect('words.sqlite')
c.execute("""
    CREATE TABLE IF NOT EXISTS words (word TEXT UNIQUE, misspell bool)
""")


def insert_words(words):
    # INSERT OR IGNORE skips words that are already in the table.
    return c.executemany("""
        INSERT OR IGNORE INTO words (word) VALUES (?)
    """, ((w,) for w in words))


for f in sys.argv[1:]:
    print(f, end=' ')
    with open(f) as fh:
        t = fh.read()
    words = (mo.group(0) for mo in re.finditer(r"\b\w+\b", t))
    with c:  # commit once per file
        cursor = insert_words(words)
    # Number of words from this file that were new to the database.
    print(cursor.rowcount)
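# spellcheck.py
import sqlite3

import enchant

c = sqlite3.connect('words.sqlite')
d = enchant.Dict("en_US")


def is_misspelled(word):
    # d.check() returns True for a correctly spelled word, so negate it
    # to get the value of the misspell flag.
    return int(not d.check(word))


c.create_function("is_misspelled", 1, is_misspelled)

with c:
    cursor = c.execute("""
        UPDATE words
        SET misspell = is_misspelled(word)
        WHERE misspell IS NULL
    """)
# Number of words checked in this run.
print(cursor.rowcount)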