Last active
April 4, 2020 17:19
Code snippet for part of the data cleansing process for the Meditations data import.
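import re

import requests

# import Meditations (plain-text file from classics.mit.edu)
response = requests.get('http://classics.mit.edu/Antoninus/meditations.mb.txt')
data = response.text

# clean the text: keep only the body between the translator credit and "THE END",
# removing every hyphen along the way
data = data.split("Translated by George Long")[1].replace("-", "").split("THE END")[0]
# remove the "BOOK ONE", "BOOK TWO", ... heading lines
data = re.sub(r"BOOK [A-Z]+\n", "", data)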
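To see what each cleaning step does without downloading anything, here is a minimal offline sketch. The `sample` string is a made-up miniature that only imitates the assumed layout of the source file (translator credit, dashed separators, BOOK headings, a closing "THE END"), and the `clean` helper is a hypothetical wrapper around the same three steps used in the snippet above.

```python
import re

# Hypothetical miniature of the source layout; NOT the real file contents.
sample = (
    "The Meditations\n"
    "By Marcus Aurelius\n"
    "Translated by George Long\n"
    "----------\n"
    "BOOK ONE\n"
    "\n"
    "From my grandfather Verus I learned good morals.\n"
    "----------\n"
    "BOOK TWO\n"
    "\n"
    "Begin the morning by saying to thyself.\n"
    "THE END\n"
    "Copyright notice\n"
)


def clean(text):
    """Apply the same cleaning steps as the snippet above (illustrative helper)."""
    # Drop everything up to and including the translator credit.
    body = text.split("Translated by George Long")[1]
    # Remove every hyphen, which also deletes the dashed separator lines' dashes.
    body = body.replace("-", "")
    # Drop everything from "THE END" onwards.
    body = body.split("THE END")[0]
    # Remove heading lines such as "BOOK ONE".
    return re.sub(r"BOOK [A-Z]+\n", "", body)


cleaned = clean(sample)
print(cleaned)
```

Note that `replace("-", "")` removes all hyphens, including any inside hyphenated words, and that `split(...)[1]` raises an `IndexError` if the marker string is missing from the text, so a check on the marker may be worth adding when working with a different edition.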