This project demonstrates how to build a list of words related to Apache Beam. The list is meant to seed a Beam-themed Wordle game. The project has the following pieces:
- A Beam pipeline in `scrape-beam-words.py` that scrapes the Beam website. The outputs of this pipeline are:
  - *beam_wordle_words.txt*: A long list of words, one per line.
  - *beam_wordle_histogram.txt*: A file detailing the frequency of each word length in the list. This helps in selecting the right word length for the Wordle.
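To illustrate what a word-length histogram contains, here is a minimal standard-library sketch. It is not the code in `scrape-beam-words.py`, which computes its outputs inside a Beam pipeline. The function names `length_histogram` and `format_histogram`, the case-insensitive deduplication, and the tab-separated layout are all assumptions made for illustration; the actual format of the histogram file may differ.

```python
from collections import Counter


def length_histogram(words):
    """Count how many distinct words there are of each length (hypothetical helper)."""
    # Assumption: deduplicate case-insensitively and skip blank entries.
    unique = {w.strip().lower() for w in words if w.strip()}
    return Counter(len(w) for w in unique)


def format_histogram(hist):
    """Render the histogram as 'length<TAB>count' lines, shortest length first."""
    # Assumption: this layout is illustrative, not the pipeline's real output format.
    return "\n".join(f"{length}\t{count}" for length, count in sorted(hist.items()))


if __name__ == "__main__":
    sample = ["beam", "Beam", "pipeline", "runner", "window", "pane", ""]
    print(format_histogram(length_histogram(sample)))
```

A length with many words gives the game a larger pool of possible answers, which is why the histogram is useful when choosing the word length.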
First, set up a virtual environment and install the pipeline requirements:
virtualenv venv
. venv/bin/activate
pip install -r requirements.txt
Once that's all installed, you can run the Beam pipeline to scrape the data:
python scrape-beam-words.py
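Once the pipeline has written its outputs, the word list can be narrowed down to candidates of a single length. The following is a rough sketch that assumes `beam_wordle_words.txt` is in the current directory and contains one word per line. The helper name `candidate_words` and the default length of 5 (the length used by the classic Wordle) are assumptions, not part of this project.

```python
from pathlib import Path


def candidate_words(path, length=5):
    """Return sorted, lowercase, purely alphabetic words of the given length.

    Hypothetical helper: assumes the file holds one word per line.
    """
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    # Normalize case and drop duplicates before filtering.
    words = {line.strip().lower() for line in lines}
    return sorted(w for w in words if len(w) == length and w.isalpha())


if __name__ == "__main__":
    words_file = Path("beam_wordle_words.txt")
    if words_file.exists():
        for word in candidate_words(words_file):
            print(word)
    else:
        print("Run scrape-beam-words.py first to generate the word list.")
```

The `isalpha` check drops tokens containing digits or punctuation, which scraped text tends to include and which would not work as Wordle answers.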