NOTE: I'll work on fleshing this out a bit more
GOAL: Retrieve, catalog and display a daily record of voter registration figures in Orange County because a spokesperson for the registrar's office said they are unable to provide the figures.
TO PULL THIS OFF:
- Determine when the figures are updated each day
- HOW: Phone conversation with registrar, reporter
- Navigate to the O.C. URL that contains the daily figures
- Create a scheduler
- HOW: Create an AWS Lambda function that is tied to a CloudWatch schedule
- https://medium.com/@kagemusha_/scraping-on-a-schedule-with-aws-lambda-and-cloudwatch-caf65bc38848
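A minimal sketch of the scheduler step, assuming a Python Lambda function triggered by a CloudWatch scheduled rule. The schedule expression is a placeholder (the real time depends on when the registrar updates the figures), and `collect_figures` is a hypothetical stand-in for the scraping step.

```python
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

# Placeholder schedule, not the registrar's real update time.
# CloudWatch cron expressions are evaluated in UTC and have six fields:
# minutes, hours, day-of-month, month, day-of-week, year.
SCHEDULE_EXPRESSION = "cron(0 15 * * ? *)"


def collect_figures():
    """Hypothetical stand-in for the scraping step.

    Returns rows shaped the way the Slack step expects them:
    dicts with 'scope' and 'number' keys. The values here are dummies.
    """
    return [{"scope": "example scope", "number": 0}]


def lambda_handler(event, context):
    """Entry point that Lambda invokes each time the scheduled rule fires."""
    # Scheduled events arrive with "source": "aws.events".
    logger.info("Triggered by %s", event.get("source", "unknown"))
    figures = collect_figures()
    return {"count": len(figures), "figures": figures}
```

The schedule itself lives in the CloudWatch rule, not in the code; the constant above only records the expression that would be entered there.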
- Scrape those figures
- HOW: Python, requests and BeautifulSoup
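The scraping step could look roughly like the sketch below. The page layout is an assumption: it supposes the figures sit in an HTML table whose rows hold a label in the first cell and a count in the second. The function names are hypothetical, and the real O.C. page may need different selectors.

```python
def parse_count(text):
    """Turn a string such as "1,234" into an int, or None if it is not a count."""
    digits = text.replace(",", "").strip()
    return int(digits) if digits.isdigit() else None


def fetch_html(url, timeout=30):
    """Download the page that holds the daily figures."""
    # Imported lazily so the parsing helpers can be exercised offline.
    import requests

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return response.text


def parse_figures(html):
    """Extract [{'scope': ..., 'number': ...}] from table rows (assumed layout)."""
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    figures = []
    for row in soup.find_all("tr"):
        cells = [cell.get_text(strip=True) for cell in row.find_all("td")]
        if len(cells) < 2:
            continue  # header rows (<th>) and spacer rows have no usable cells
        number = parse_count(cells[1])
        if number is not None:
            figures.append({"scope": cells[0], "number": number})
    return figures
```

Keeping the download and the parsing separate means the parser can be checked against a saved copy of the page before the Lambda function runs on a schedule.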
- Save them to a .csv file stored on AWS S3
- HOW: Tie my AWS Lambda function to S3
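One way to keep a running daily record is to append each day's rows to a single .csv object. The sketch below assumes a `date,scope,number` column layout and that the Lambda role may read and write the bucket. The bucket and key are whatever the caller passes in, and the function names are hypothetical.

```python
import csv
import datetime
import io

FIELDNAMES = ["date", "scope", "number"]  # assumed column layout


def append_rows(existing_csv, rows, as_of):
    """Return CSV text with one line per figure appended to existing_csv."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    if existing_csv.strip():
        # Keep the earlier days and make sure they end with a newline.
        buffer.write(existing_csv if existing_csv.endswith("\n") else existing_csv + "\n")
    else:
        writer.writeheader()
    for row in rows:
        writer.writerow(
            {"date": as_of.isoformat(), "scope": row["scope"], "number": row["number"]}
        )
    return buffer.getvalue()


def save_to_s3(bucket, key, rows, as_of):
    """Read the current .csv from S3 (if any), append today's rows, write it back."""
    import boto3  # bundled with the AWS Lambda Python runtime

    s3 = boto3.client("s3")
    try:
        existing = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
    except s3.exceptions.NoSuchKey:
        existing = ""  # first run: the file does not exist yet
    body = append_rows(existing, rows, as_of)
    s3.put_object(Bucket=bucket, Key=key, Body=body.encode("utf-8"), ContentType="text/csv")
```

For example, `append_rows("", rows, datetime.date.today())` produces a header line followed by today's rows, and later calls add lines beneath it.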
- Send a message to a Slack channel with the figures
- HOW: Use incoming webhooks
    import datetime
    import json
    import logging
    import urllib.request

    logger = logging.getLogger(__name__)


    def send_slack_message(webhook_url, data):
        """Post the latest figures to a Slack incoming webhook."""
        slack_message = 'Updated Orange County Voter Registration\nas of {0}\n'.format(
            datetime.datetime.now())
        # One tab-indented line per figure: "<scope>: <number>"
        for item in data:
            slack_message += '\t{0}: {1}\n'.format(item['scope'], item['number'])
        json_data = json.dumps({'text': slack_message})
        try:
            req = urllib.request.Request(
                webhook_url,
                data=json_data.encode('utf-8'),
                headers={'Content-Type': 'application/json'}
            )
            resp = urllib.request.urlopen(req)
            logger.info(resp.status)
        except Exception as em:
            logger.error('exception: ' + str(em))
- Visualize the figures each day in an Observable notebook
- By connecting to the S3 .csv
- HOW: Create an Observable notebook and use the aws-sdk JavaScript library
- Charting the data
- HOW: Use Vega-Lite
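A Vega-Lite chart is described by a plain JSON spec, so the charting step can be sketched outside the notebook. The example below builds such a spec as a Python dict only to keep the examples in one language. It assumes the .csv has `date`, `scope` and `number` columns and is reachable at a URL, neither of which is confirmed here. If the file is instead read through the aws-sdk, the rows could be supplied as inline values rather than a URL.

```python
import json


def build_chart_spec(csv_url):
    """Return a Vega-Lite spec for one line per scope over time (assumed columns)."""
    return {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "data": {"url": csv_url, "format": {"type": "csv"}},
        "mark": "line",
        "encoding": {
            "x": {"field": "date", "type": "temporal"},
            "y": {"field": "number", "type": "quantitative"},
            "color": {"field": "scope", "type": "nominal"},
        },
    }


# Hypothetical URL, for illustration only.
spec = build_chart_spec("https://example.com/figures.csv")
spec_json = json.dumps(spec, indent=2)
```

The resulting JSON can be handed to whatever Vega-Lite renderer the notebook uses.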