Code review notes

Mahmoud Hashemi, October 2017

download_epa.sh

could be ported python and perhaps parameterized (i.e., not hardcoding 2016 data)

expects a specific file, might be better to merge this functionality and fetching the data into a single CLI frontend
Do we want to add a bit more logic for format testing the input files? Like basically, do the column names and number of columns match up?
I might move some local variables like chunk/chunksize values to globals or even arguments.
only adds a single index, something to keep an eye on perhaps
line 140 has an assert with a try/except, better to use a simple if statement, especially since the error is not fatal.
line 262 is another instance of catching an AssertionError. Another reason to avoid asserts for production requirements is that supposedly some folks run python with the -O parameter, "optimizing" it, and making it skip assertions.
Nice messaging, especially the note about how messages from the program will begin with arrows, helps disambiguate streams a bit.
Have you tried wrapping the whole operation in a transaction? I see right now there's no automatic rollback on cancellation or failure, meaning that the database can be left in a bad state. At the very least we'll need a drop_schema.py or something to reset the db. Sometimes having a transaction can speed things up, because autocommit is slow. Don't forget that flush() is a thing, too.
This tool could be put into the package and exposed as a separate entrypoint, as well.

The _about.py loading approach is a fine one, I think, but maybe make the path less relative. Right now it relies on setup.py being run with the project directory as the working directory. Simply make it __file__ relative.
setuptools might also be good to add to setup_requires and/or install_requires
Definitely want to add at least some loose version requirements to the requirements.

For __repr__ and friends, it's good practice for refactorability and inheritability to reference self.__class__.__name__ instead of hardcoding the type name. (e.g., line 63)
It's interesting how the str and repr differ. Remember __repr__ is meant to be round-trippable, and right now the env isn't in there, so the signatures don't match.
For to_json() and to_html(), my expectation would be that these would return JSON and HTML text strings, respectively, similar to how to_dict() returns a dict. Right now they output files. If that's the desired behavior, consider changing the names to export_*() or save_*().

get_notes() could simply be return cmg.info.get('notes', '')
line 64 in info_to_context doesn't need to be an .update(). simply do ret[key] = info[key]

If this program is meant for python3 and python3 only, as the setup.py suggests, then no future import unicode_literals is necessary.
Design-wise, it's a good idea to write some stuff to stdout to keep the user informed on program execution, but to keep a much more complete record in a single file you can kick over when a crash or bug occurs. This means redirecting sys.stdout on subprocesses, and writing arguments and so forth, as well.

Maybe want to optionally enable sqlalchemy query logs in debug mode.
Would avoid using __call__ unless necessary. Even when necessary, better to have a normal method name (e.g., get_results()) and do __call__ = get_results in the class definition.

Automated integration test would be ideal.
Once APIs settle down a bit, would be good to get coverage in here.
Look into tox, add some packaging testing a la https://github.com/python-hyper/hyperlink/blob/2bc8dff95e8b1394c35bde1b5d40ee78e42c92df/tox.ini#L19
Get some CI going on (if only for unit tests and packaging tests)