Spark
If you have (larger-than-memory) petabytes of JSON/XML/CSV files, a simple workflow, and a thousand-node cluster
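For example, a minimal PySpark sketch (the application name, input path, and column names are hypothetical) of a simple read-filter-aggregate-write workflow might look like this:

```python
# Minimal PySpark sketch; assumes a running cluster and a hypothetical
# input path "s3://my-bucket/logs/*.json".
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("simple-etl").getOrCreate()

# Spark reads the JSON files in parallel across the cluster's executors.
logs = spark.read.json("s3://my-bucket/logs/*.json")

# A simple workflow: filter, aggregate, and write the result back out.
(logs.filter(logs.status == "error")
     .groupBy("service")
     .count()
     .write.parquet("s3://my-bucket/error-counts/"))

spark.stop()
```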
Dask
If you have (larger-than-memory) 10s-1000s of gigabytes of binary or numeric data (e.g., HDF5, netCDF4, CSV.gz), complex algorithms, and a (single) large multi-core workstation
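As a sketch, assuming a hypothetical HDF5 file and dataset path, Dask can wrap an on-disk array in chunks and reduce it in parallel on the local cores:

```python
# Minimal Dask sketch for larger-than-memory numeric data on a single
# multi-core workstation (file name and dataset path are hypothetical).
import h5py
import dask.array as da

f = h5py.File("measurements.h5", "r")
# Wrap the on-disk dataset in a chunked Dask array; nothing is loaded yet.
x = da.from_array(f["/data/temperature"], chunks=(1_000_000,))

# Build a lazy task graph, then compute it in parallel across local cores.
result = ((x - x.mean()) / x.std()).max().compute()
print(result)
```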
SQLite
If you have (larger-than-memory or not) less than a terabyte of content and only one writer at a time, if you need local (on-disk) data storage (permanent or temporary) for individual applications or devices, if you need to query/analyze a large dataset of text files such as CSV/XML without loading it into memory, or if you want to stick to the standard library (the sqlite3 module is built into Python)
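A minimal sqlite3 sketch, assuming a hypothetical CSV file with "sensor" and "value" columns: load the rows into a local on-disk database, then query them without holding the whole dataset in memory.

```python
import csv
import sqlite3

conn = sqlite3.connect("local.db")   # single-file, zero-configuration storage
conn.execute("CREATE TABLE IF NOT EXISTS readings (sensor TEXT, value REAL)")

with open("readings.csv", newline="") as fh:
    rows = ((r["sensor"], float(r["value"])) for r in csv.DictReader(fh))
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
conn.commit()

# The aggregation runs on disk; only the small result set comes back.
for sensor, avg in conn.execute(
        "SELECT sensor, AVG(value) FROM readings GROUP BY sensor"):
    print(sensor, avg)
conn.close()
```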
MongoDB, PostgreSQL
If you have (larger-than-memory) a terabyte or less of JSON/XML/CSV, if you have multiple writers at a time, or if you need/want a client/server scheme
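As an illustration of the client/server scheme, here is a minimal PyMongo sketch (server address, database, and collection names are hypothetical); multiple application instances can write to the same server concurrently. PostgreSQL works analogously through a client library such as psycopg2.

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
events = client["analytics"]["events"]

# Each client process can insert documents independently of the others.
events.insert_one({"user": "alice", "action": "login", "ok": True})

# The query runs on the server; results are streamed back to the client.
for doc in events.find({"action": "login"}).limit(10):
    print(doc)
```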
Cassandra
If you have lots of data coming in very quickly (from different locations), of heterogeneous types (schema-less), many terabytes or petabytes in size, if you need multiple servers/a distributed system (with room to expand in the future), and if you need constant availability (fault tolerance) while keeping the setup simple
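A minimal sketch with the DataStax cassandra-driver, assuming a hypothetical keyspace "metrics" and table "readings" keyed by sensor_id: any reachable node can accept the request, and writes are replicated across the cluster.

```python
from cassandra.cluster import Cluster

cluster = Cluster(["10.0.0.1", "10.0.0.2"])   # contact points are hypothetical
session = cluster.connect("metrics")          # connect to an existing keyspace

# Writes are distributed and replicated across the cluster automatically.
session.execute(
    "INSERT INTO readings (sensor_id, ts, value) "
    "VALUES (%s, toTimestamp(now()), %s)",
    ("sensor-42", 21.5),
)

rows = session.execute(
    "SELECT value FROM readings WHERE sensor_id = %s", ("sensor-42",))
for row in rows:
    print(row.value)

cluster.shutdown()
```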
Cython
If your code will be deployed by others (you can distribute a package with the compiled, optimized extensions), if you need to accelerate code that uses advanced Python features (e.g., lists, dicts, recursion, array allocation), if you need to call C directly, or if your function operates on a pre-defined (fixed) number of dimensions
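A minimal sketch of a hypothetical Cython module (example.pyx): static C types on the loop variables and a direct call into the C math library, for a function fixed to 1-D input. It would be compiled into an extension module (e.g., with setuptools) and imported from Python like any other module.

```cython
# example.pyx -- hypothetical sketch
from libc.math cimport sqrt   # call the C sqrt directly, no Python overhead

def norm(double[:] values):
    """Euclidean norm of a 1-D buffer (number of dimensions is fixed)."""
    cdef double total = 0.0
    cdef Py_ssize_t i
    for i in range(values.shape[0]):
        total += values[i] * values[i]
    return sqrt(total)
```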
Numba
If you don’t need to distribute your code beyond your computer or your team (especially if you use Conda), if you need to accelerate code that uses scalars or (N-dimensional) arrays, if you want to write functions that work (automatically) on N-dimensional arrays
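A minimal Numba sketch: @njit compiles a plain Python loop over an array, and @vectorize turns a scalar function into a ufunc that broadcasts over N-dimensional arrays automatically (function and variable names are illustrative).

```python
import numpy as np
from numba import njit, vectorize

@njit
def pairwise_sum(a):
    # The loop is compiled to machine code on first call.
    total = 0.0
    for i in range(a.shape[0]):
        total += a[i]
    return total

@vectorize(["float64(float64, float64)"])
def scaled_diff(x, y):
    # Written for scalars, applied element-wise to arrays of any shape.
    return (x - y) / (x + y + 1e-12)

data = np.random.rand(1_000_000)
print(pairwise_sum(data))
print(scaled_diff(data, data[::-1]).shape)   # broadcasts like a NumPy ufunc
```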