To run it, execute python challenge.py. It prints each word and its occurrence count, one pair per line, to stdout. Make sure the data file is in the same folder as the script and is named data.txt. (Tested on Ubuntu 18.04 with Python 3.6.6 and 2.7.)
If "what I would do next to put it into production" means how I could make it more robust, then:
- Parametrize the data file, for example by passing its path as a command-line argument or by reading it from some sort of document storage.
- Use a much more robust way of tokenizing the input, for example the NLTK (Natural Language Toolkit) library.
- Also use a better way of counting the words. NumPy has some nice histogram functions.
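
As a rough illustration of the first point, here is a minimal sketch that takes the data file path as an optional argument. It is not the code in challenge.py: the names tokenize, count_words and main are hypothetical, the regular expression is only a simple stand-in for a real tokenizer such as NLTK's, and collections.Counter from the standard library is used for counting instead of NumPy.

```python
import argparse
import io
import re
from collections import Counter

# Assumption: a "word" is a run of ASCII letters or apostrophes.
# A real tokenizer (NLTK, for example) would handle far more cases.
WORD_RE = re.compile(r"[a-z']+")


def tokenize(text):
    """Lowercase the text and return the list of words found in it."""
    return WORD_RE.findall(text.lower())


def count_words(text):
    """Return a Counter mapping each word to its number of occurrences."""
    return Counter(tokenize(text))


def main(argv=None):
    """Parse the command line, read the file and print the counts."""
    parser = argparse.ArgumentParser(
        description="Count word occurrences in a text file.")
    # The path is optional and falls back to the current default, data.txt.
    parser.add_argument("path", nargs="?", default="data.txt",
                        help="text file to read (default: data.txt)")
    args = parser.parse_args(argv)

    # io.open accepts an encoding on both Python 2.7 and Python 3.
    with io.open(args.path, encoding="utf-8") as handle:
        counts = count_words(handle.read())

    # One word and its count per line, most frequent first.
    for word, count in counts.most_common():
        print("{} {}".format(word, count))
```

If the script called main() under the usual if __name__ == "__main__": guard, running python challenge.py would still read data.txt, and python challenge.py other.txt would read a different file. Keeping tokenize and count_words as separate functions means either one could later be swapped for an NLTK or NumPy based version without touching the rest.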