@davebarkerxyz
Last active May 19, 2020 07:43
Rebuild Flask-WhooshAlchemy search indices
#!/usr/bin/env python
"""
Rebuild all Whoosh search indices.

Useful after manually importing data (side-stepping the SQLAlchemy ORM
and automatic Whoosh index updates).

If this is intended as a full rebuild, you should consider deleting the
Whoosh search database (as specified in app.config["WHOOSH_BASE"])
before running the rebuild. This will ensure that no old/stale
data is left in the search indices (this process doesn't delete removed
data; it only recreates search entries for current data).
"""
import datetime

from app import app, models
import whoosh
import flask_whooshalchemy
program_start = datetime.datetime.utcnow()


def log(message):
    logtime = datetime.datetime.utcnow()
    logdiff = logtime - program_start
    print("{0} (+{1:.3f}): {2}".format(logtime.strftime("%Y-%m-%d %H:%M:%S"),
                                       logdiff.total_seconds(),
                                       message))
try:
    text_type = unicode  # Python 2
except NameError:
    text_type = str  # Python 3 has no unicode builtin


def rebuild_index(model):
    """Rebuild search index of Flask-SQLAlchemy model"""
    log("Rebuilding {0} index...".format(model.__name__))
    # Create/open the index first; whoosh_index() is what sets model.pure_whoosh
    index_writer = flask_whooshalchemy.whoosh_index(app, model)
    primary_field = model.pure_whoosh.primary_key_name
    searchables = model.__searchable__

    # Fetch all data
    entries = model.query.all()

    entry_count = 0
    with index_writer.writer() as writer:
        for entry in entries:
            index_attrs = {}
            for field in searchables:
                index_attrs[field] = text_type(getattr(entry, field))
            index_attrs[primary_field] = text_type(getattr(entry, primary_field))
            writer.update_document(**index_attrs)
            entry_count += 1
    log("Rebuilt {0} {1} search index entries.".format(str(entry_count), model.__name__))


if __name__ == "__main__":
    model_list = [models.Product,
                  models.Commodity,
                  models.Category,
                  models.Page]
    for model in model_list:
        rebuild_index(model)
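
The docstring above suggests deleting the Whoosh search database before a full rebuild. A minimal sketch of that step, assuming the index lives in a plain directory tree at the path stored in app.config["WHOOSH_BASE"]; the helper name clear_whoosh_base is hypothetical and not part of the original script:

```python
import os
import shutil


def clear_whoosh_base(whoosh_base):
    """Delete the index directory tree at whoosh_base, if it exists.

    Hypothetical helper: returns True when a directory was removed and
    False when there was nothing to delete.
    """
    if os.path.isdir(whoosh_base):
        shutil.rmtree(whoosh_base)
        return True
    return False
```

It would probably be called once, as clear_whoosh_base(app.config["WHOOSH_BASE"]), before the loop over model_list, so that every index is recreated from scratch and no stale entries survive.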

amumtaz commented Aug 14, 2015

I am new to Flask/Whoosh and am working in a virtualenv (Python 2.7) on indexing some text. I keep getting the following error when I run the script:

File "rebuildwhooshindex.py", line 4, in <module>
import lib
ImportError: No module named lib

Any suggestions?

@dhamaniasad

You can just remove the import lib line and the script will work. As you can see, lib is not being used anywhere in the program.

@davebarkerxyz
Author

@amumtaz @dhamaniasad Oops, my bad. lib was a project-specific package where I was holding frozen libs to avoid polluting our CI and production servers' site-packages. I'd probably do things differently this time around. I've removed the offending line.


chrisphyffer commented May 30, 2017

Thank you very much Dave, this snippet is very helpful. :)

@tanatarca

Hi! I have a problem here: my model does not seem to have the attribute pure_whoosh. I did create the index beforehand, and I have run Whoosh queries without problems, so I don't know where this issue comes from.

@SollyTaylor

model.pure_whoosh.primary_key_name should be added by the whoosh_index function.
In the flask_whooshalchemy.py file, whoosh_index calls _create_index, where pure_whoosh and whoosh_primary_key are intentionally added to the model:

model.pure_whoosh = _Searcher(primary_key, indx)
model.whoosh_primary_key = primary_key

I wrote some code, run via Flask-Script, something like:

wa.whoosh_index(app, Contract)
rebuild_index(Contract, app)

which eventually rebuilt the indexes.
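
The ordering described above (create the index first, then read pure_whoosh) can be sketched as a small guard. This is only an illustration: primary_key_name_for and its create_index parameter are hypothetical names, and create_index stands in for any callable that attaches pure_whoosh to the model, such as a wrapper around whoosh_index(app, model).

```python
def primary_key_name_for(model, create_index):
    """Return model.pure_whoosh.primary_key_name, creating the index if needed.

    create_index is a hypothetical callable taking the model; it is only
    invoked when the model has no pure_whoosh attribute yet.
    """
    if not hasattr(model, "pure_whoosh"):
        create_index(model)
    return model.pure_whoosh.primary_key_name
```

With Flask-WhooshAlchemy imported as wa, the call might look like primary_key_name_for(Contract, lambda m: wa.whoosh_index(app, m)).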
