James Briggs jamescalam

👻
View GitHub Profile
@jamescalam
jamescalam / meditations_shuffle_batch.py
Created April 1, 2020 17:15
Snippet showing how to shuffle and batch a dataset after it has been converted into input/output sequences. The dataset must be a TensorFlow Dataset object.
# shuffling
shuffled = dataset.shuffle(BUFFER_SIZE)
# batching
dataset = shuffled.batch(BATCH_SIZE, drop_remainder=True)
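To make the effect of these two calls concrete, here is a plain-Python sketch of buffer-based shuffling followed by batching with the remainder dropped. It does not use TensorFlow: `buffered_shuffle` and `batch` are hypothetical helpers that only imitate the documented behaviour of `shuffle(BUFFER_SIZE)` and `batch(BATCH_SIZE, drop_remainder=True)`, and the sizes used are arbitrary.

```python
import random


def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in a shuffled order using a fixed-size buffer (illustrative only)."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            # fill the buffer before emitting anything
            buffer.append(item)
        else:
            # emit a random buffered element and put the new item in its slot
            idx = rng.randrange(buffer_size)
            yield buffer[idx]
            buffer[idx] = item
    # input exhausted: emit whatever is left in random order
    rng.shuffle(buffer)
    yield from buffer


def batch(items, batch_size, drop_remainder=True):
    """Group items into lists of batch_size, optionally dropping a short final batch."""
    current = []
    for item in items:
        current.append(item)
        if len(current) == batch_size:
            yield current
            current = []
    if current and not drop_remainder:
        yield current


# ten elements, batches of three: the tenth element is dropped
batches = list(batch(buffered_shuffle(range(10), buffer_size=4, seed=0), batch_size=3))
```

A small buffer only shuffles locally, which is why a buffer at least as large as the dataset is usually chosen when a full shuffle is wanted.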
jamescalam / meditations_ensemble.py
Last active April 18, 2020 14:19
Sample control code for ensemble recurrent neural network text generation.
def gladiator_predict(model_list, start, end, sequences=10, vis=False):
    text = ""  # initialise our text string
    meditations = {}  # initialise generated text dictionary
    models = {}  # initialise models dictionary
    # loop through each model in the model list and load the model itself and related char2idx mappings
    for modelname in model_list:
        models[modelname] = {}
        models[modelname]['model'] = dw.load_model(modelname)
jamescalam / hello_lucilius_import.py
Created May 4, 2020 10:29
Importing the Beautiful Soup library and extracting the list of letter links for Epistulae Morales Ad Lucilium.
import requests
from bs4 import BeautifulSoup
# import page containing links to all of Seneca's letters
# get web address
src = "https://en.wikisource.org/wiki/Moral_letters_to_Lucilius"
html = requests.get(src).text # pull html as text
soup = BeautifulSoup(html, "html.parser") # parse into BeautifulSoup object
jamescalam / hello_lucilius_letter.py
Created May 4, 2020 12:17
Function used to pull a single letter page for the Epistulae Morales Ad Lucilium extraction.
# create function to pull letter from webpage (pulls text within <p> elements)
def pull_letter(http):
    # get html from webpage given by 'http'
    html = requests.get(http).text
    # parse into a beautiful soup object
    soup = BeautifulSoup(html, "html.parser")
    # build text contents within all p elements
    txt = '\n'.join([x.text for x in soup.find_all('p')])
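The paragraph-joining step can be tried offline. The sketch below swaps Beautiful Soup for the standard library's `html.parser` so it runs without third-party packages or network access. The `ParagraphText` class and the sample HTML are made up for illustration and are not part of the original gist.

```python
from html.parser import HTMLParser


class ParagraphText(HTMLParser):
    """Collect the text found inside <p> elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._in_p = False

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True
            self.paragraphs.append("")  # start a new paragraph

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_p = False

    def handle_data(self, data):
        # text of nested tags such as <i> is still part of the paragraph
        if self._in_p:
            self.paragraphs[-1] += data


# made-up page standing in for a letter page
sample_html = (
    "<html><body><h1>Letter 1</h1>"
    "<p>Greetings from <i>Seneca</i>.</p>"
    "<div>nav</div>"
    "<p>Farewell.</p></body></html>"
)

parser = ParagraphText()
parser.feed(sample_html)
parser.close()

# same joining step as in the snippet above
txt = "\n".join(parser.paragraphs)
```

Text outside `<p>` elements, such as the heading and the navigation `div`, is left out of `txt`.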
jamescalam / hello_lucilius_all_letters.py
Created May 4, 2020 12:39
Creating a dictionary containing the Moral Letters to Lucilius data. Dictionary key is letter number, value contains [local href, letter contents].
import re

# compile RegEx for finding 'Letter 12', 'Letter 104' etc
letters_regex = re.compile(r"^Letter\s+[0-9]{1,3}$")
# create dictionary containing letter number: [local href, letter contents] for all that satisfy above RegEx
moral_letters = {
    x.contents[0]:
    [x.get('href'), pull_letter(f"https://en.wikisource.org{x.get('href')}")]
    for x in soup.find_all('a')
    if len(x.contents) > 0
    if letters_regex.match(str(x.contents[0]))
}
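To see which link texts the pattern accepts, it can be run against a handful of strings without touching the network. The sample link texts below are invented for illustration; only the pattern itself comes from the snippet.

```python
import re

# same pattern as above, written as a raw string
letters_regex = re.compile(r"^Letter\s+[0-9]{1,3}$")

# made-up link texts of the kind an index page might contain
link_texts = [
    "Letter 1",
    "Letter 104",
    "Letter  12",        # two spaces: still matched by \s+
    "Letters",           # no number
    "Letter 1234",       # more than three digits
    "letter 5",          # lowercase 'l'
    "Letter 7 (notes)",  # trailing text after the number
]

matched = [t for t in link_texts if letters_regex.match(t)]
```

Here only the first three strings are kept: the anchors `^` and `$` reject anything with extra text, and `{1,3}` caps the letter number at three digits.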
jamescalam / hello_lucilius_data_idx.py
Last active May 4, 2020 13:39
Converting text data to index data using a char2idx mapping dictionary.
# join the moral letters dictionary into a single string
txt = "\n".join([moral_letters[key][1] for key in moral_letters])
# create vocab from text string (txt)
vocab = sorted(set(txt))
# create char2idx mappings from the vocabulary
char2idx = {c: i for i, c in enumerate(vocab)}
# converting data from characters to indexes
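A small round trip shows what the mapping does. This is a sketch on a short made-up string; `idx2char`, `data_idx`, and `decoded` are names assumed here for illustration, and a plain list is used where the original may well use an array.

```python
# short stand-in for the full letters text
txt = "vale, lucili"

# vocabulary: every distinct character, in sorted order
vocab = sorted(set(txt))

# character -> index, and the reverse lookup (assumed name)
char2idx = {c: i for i, c in enumerate(vocab)}
idx2char = {i: c for c, i in char2idx.items()}

# encode the text as a list of integer indexes
data_idx = [char2idx[c] for c in txt]

# decoding the indexes recovers the original text
decoded = "".join(idx2char[i] for i in data_idx)
```

Because `vocab` is sorted, the same text always produces the same mapping, which matters when a trained model is reloaded later.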
jamescalam / hello_lucilius_dataset.py
Created May 4, 2020 14:56
Creating the dataset object ready for training in tensorflow.
# define the input/target data splitting function
def split_xy(seq):
    input_data = seq[:-1]
    target_data = seq[1:]
    return input_data, target_data
SEQLEN = 100 # the number of characters in a single sequence
BATCHSIZE = 64 # how many sequences in a single training batch
BUFFER = 10000 # how many elements are contained within a single shuffling space
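The effect of `split_xy` is easiest to see on plain strings. The sketch below assumes the common approach of cutting the data into chunks of sequence length + 1 so that each chunk yields an input and a target of equal length. `make_sequences` is a hypothetical helper, and a sequence length of 4 is used instead of 100 to keep the output readable.

```python
def split_xy(seq):
    # input drops the last element, target drops the first
    return seq[:-1], seq[1:]


def make_sequences(data, seqlen):
    """Cut data into chunks of seqlen + 1, dropping an incomplete final chunk."""
    step = seqlen + 1
    return [data[i:i + step] for i in range(0, len(data) - step + 1, step)]


text = "hello lucilius"
pairs = [split_xy(chunk) for chunk in make_sequences(text, seqlen=4)]
# pairs == [("hell", "ello"), (" luc", "luci")]
```

Each target is the input shifted one character to the right, so at every position the model is trained to predict the next character.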
jamescalam / function_parameter_match_regex.py
Created May 5, 2020 12:41
Code snippet demonstrating how to match parameter details from function docstrings formatted to numpy/scipy docstring standards.
# first run parameter match regex
param_re = re.compile(r"(?s)\w+ : .*?(?=\w+ :)")
# the above will not find the final parameter, this will
param_re2 = re.compile(r"(?s)\w+ : .*")
params = [] # initialise parameter list
while True:
    # find a parameter
    new_param = param_re.search(text)
jamescalam / function_parameter_extract.py
Last active May 5, 2020 13:11
Code snippet demonstrating how to extract all parameter details from function docstring using numpy/scipy docstring standards.
# first run parameter match regex
param_re = re.compile(r"(?s)\w+ : .*?(?=\w+ :)")
# the above will not find the final parameter, this will
param_re2 = re.compile(r"(?s)\w+ : .*")
params = {} # initialise parameter dictionary
while True:
    # find a parameter
    new_param = param_re.search(text)
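The preview stops partway through the loop, so the following is one possible way to complete it and not the gist's actual code. It assumes each match is cut off the front of `text` before searching again, and that the dictionary maps a parameter name to its type and description. The sample docstring section and the `add_param` helper are made up.

```python
import re

param_re = re.compile(r"(?s)\w+ : .*?(?=\w+ :)")  # a parameter followed by another
param_re2 = re.compile(r"(?s)\w+ : .*")           # the final parameter

# made-up parameters section in numpy docstring style
text = """x : int
    The first value.
scale : float, optional
    Multiplier applied to x.
name : str
    Label for the result.
"""

params = {}


def add_param(block):
    # split 'name : details' at the first separator (assumed storage format)
    name, _, details = block.partition(" : ")
    params[name.strip()] = details.strip()


while True:
    new_param = param_re.search(text)
    if new_param is None:
        # nothing follows the remaining parameter, so use the second pattern
        last_param = param_re2.search(text)
        if last_param:
            add_param(last_param.group())
        break
    add_param(new_param.group())
    # drop the matched part and keep searching the rest
    text = text[new_param.end():]
```

One caveat of the lookahead `(?=\w+ :)`: a description that itself contains a word followed by ` :` would end the match early, so this works best on docstrings that follow the format strictly.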
jamescalam / head_example.html
Created May 5, 2020 17:23
Minimal markup needed in the documentation's HTML head.
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<title>Docs</title>