jamescalam’s gists

jamescalam / meditations_shuffle_batch.py

Created April 1, 2020 17:15

Snippet showing how to shuffle and batch a dataset after it has been converted into input/output sequences. The dataset must be a tf.data.Dataset object.

	# shuffling
	shuffled = dataset.shuffle(BUFFER_SIZE)
	# batching
	dataset = shuffled.batch(BATCH_SIZE, drop_remainder=True)
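To illustrate what drop_remainder=True does, here is a plain-Python sketch. It is not TensorFlow's implementation: batch_drop_remainder is a hypothetical helper, and the sizes are made up for illustration. It shuffles a list and groups it into fixed-size batches, discarding the final short batch.

```python
import random


def batch_drop_remainder(items, batch_size):
    """Group items into lists of exactly batch_size, dropping any short final batch."""
    n_full = len(items) // batch_size
    return [items[i * batch_size:(i + 1) * batch_size] for i in range(n_full)]


BATCH_SIZE = 4  # illustrative value only, not taken from the gist
data = list(range(10))

random.seed(0)
# random.shuffle is a full shuffle; Dataset.shuffle instead samples from a
# rolling buffer of BUFFER_SIZE elements
random.shuffle(data)

batches = batch_drop_remainder(data, BATCH_SIZE)
print(len(batches))  # 2 full batches; the 2 leftover elements are dropped
```

Dropping the remainder keeps every batch the same shape, which matters for models built with a fixed batch size.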

jamescalam / meditations_ensemble.py

Last active April 18, 2020 14:19

Sample control code for ensemble recurrent neural network text generation.

	def gladiator_predict(model_list, start, end, sequences=10, vis=False):

	    text = ""  # initialise our text string
	    meditations = {}  # initialise generated text dictionary
	    models = {}  # initialise models dictionary

	    # loop through each model in the model list and load the model itself and related char2idx mappings
	    for modelname in model_list:
	        models[modelname] = {}
	        models[modelname]['model'] = dw.load_model(modelname)

jamescalam / hello_lucilius_import.py

Created May 4, 2020 10:29

Importing the Beautiful Soup library and extracting the list of letter links for Epistulae Morales ad Lucilium.

	import requests
	from bs4 import BeautifulSoup

	# import page containing links to all of Seneca's letters
	# get web address
	src = "https://en.wikisource.org/wiki/Moral_letters_to_Lucilius"

	html = requests.get(src).text # pull html as text
	soup = BeautifulSoup(html, "html.parser") # parse into BeautifulSoup object

jamescalam / hello_lucilius_letter.py

Created May 4, 2020 12:17

Function used to pull a single letter page for the Epistulae Morales Ad Lucilium extraction.

	# create function to pull letter from webpage (pulls text within <p> elements)
	def pull_letter(http):

	    # get html from webpage given by 'http'
	    html = requests.get(http).text
	    # parse into a beautiful soup object
	    soup = BeautifulSoup(html, "html.parser")

	    # build text contents within all p elements
	    txt = '\n'.join([x.text for x in soup.find_all('p')])

jamescalam / hello_lucilius_all_letters.py

Created May 4, 2020 12:39

Creating a dictionary containing the Moral Letters to Lucilius data. Each key is the letter title (for example 'Letter 12'); each value is [local href, letter contents].

	import re

	# compile RegEx for finding 'Letter 12', 'Letter 104' etc
	letters_regex = re.compile(r"^Letter\s+[0-9]{1,3}$")

	# create dictionary containing letter title: [local href, letter contents] for all that satisfy above RegEx
	moral_letters = {
	    x.contents[0]:
	    [x.get('href'), pull_letter(f"https://en.wikisource.org{x.get('href')}")]
	    for x in soup.find_all('a')
	    if len(x.contents) > 0
	    if letters_regex.match(str(x.contents[0]))
	}
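As a quick check of which link texts the pattern accepts, the sketch below runs it against a few strings. The candidate strings are made up for illustration and are not taken from the Wikisource page.

```python
import re

# same pattern as the gist, written as a raw string so \s is not treated as a string escape
letters_regex = re.compile(r"^Letter\s+[0-9]{1,3}$")

# hypothetical link texts
candidates = [
    "Letter 12",
    "Letter 104",
    "Letter  7",      # extra whitespace is allowed by \s+
    "Letter 1234",    # four digits: rejected by {1,3}
    "Letters",        # no number: rejected
    "See Letter 12",  # does not start with 'Letter': rejected
]

matched = [c for c in candidates if letters_regex.match(c)]
print(matched)  # ['Letter 12', 'Letter 104', 'Letter  7']
```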

jamescalam / hello_lucilius_data_idx.py

Last active May 4, 2020 13:39

Converting text data to index data using a char2idx mapping dictionary.

	# join the moral letters dictionary into a single string
	txt = "\n".join([moral_letters[key][1] for key in moral_letters])

	# create vocab from text string (txt)
	vocab = sorted(set(txt))

	# create char2idx mappings from the vocabulary
	char2idx = {c: i for i, c in enumerate(vocab)}

	# converting data from characters to indexes
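The conversion step itself is cut off in the preview above. A minimal sketch of how it might look, assuming a plain Python list is acceptable for the indexed data (the gist may well use an array instead); the short txt string and the idx2char reverse mapping are stand-ins added for illustration:

```python
# stand-in text; in the gist this is the joined moral letters string
txt = "hello lucilius"

# vocabulary and character-to-index mapping, as in the gist
vocab = sorted(set(txt))
char2idx = {c: i for i, c in enumerate(vocab)}

# hypothetical reverse mapping, useful for turning predictions back into text
idx2char = {i: c for c, i in char2idx.items()}

# convert every character to its index
data_idx = [char2idx[c] for c in txt]

# round trip back to text to confirm the mapping loses nothing
decoded = "".join(idx2char[i] for i in data_idx)
print(len(vocab), decoded == txt)  # 9 True
```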

jamescalam / hello_lucilius_dataset.py

Created May 4, 2020 14:56

Creating the dataset object ready for training in TensorFlow.

	# define the input/target data splitting function
	def split_xy(seq):
	    input_data = seq[:-1]
	    target_data = seq[1:]
	    return input_data, target_data

	SEQLEN = 100 # the number of characters in a single sequence
	BATCHSIZE = 64 # how many sequences in a single training batch
	BUFFER = 10000 # how many elements are contained within a single shuffling space
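To show what split_xy produces, here is a plain-Python sketch on a short string. It assumes the text is first cut into chunks of SEQLEN + 1 characters so that the input and target each come out SEQLEN long; that chunking step is not visible in the preview above and is an assumption here, as are the sample text and the shortened SEQLEN.

```python
def split_xy(seq):
    input_data = seq[:-1]   # everything except the last character
    target_data = seq[1:]   # the same sequence shifted one step ahead
    return input_data, target_data


SEQLEN = 5  # shortened from 100 so the output is readable
text = "Greetings from Seneca"  # made-up sample text

# cut the text into chunks of SEQLEN + 1 characters, dropping any short tail
chunk_len = SEQLEN + 1
chunks = [text[i:i + chunk_len]
          for i in range(0, len(text) - chunk_len + 1, chunk_len)]

pairs = [split_xy(chunk) for chunk in chunks]
print(pairs[0])  # ('Greet', 'reeti')
```

Each target character is the character that follows the corresponding input character, which is what a next-character prediction model trains on.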

jamescalam / function_parameter_match_regex.py

Created May 5, 2020 12:41

Code snippet demonstrating how to match parameter details from function docstrings formatted to numpy/scipy docstring standards.

	import re

	# first run parameter match regex
	param_re = re.compile(r"(?s)\w+ : .*?(?=\w+ :)")
	# the above will not find the final parameter, this will
	param_re2 = re.compile(r"(?s)\w+ : .*")

	params = []  # initialise parameter list

	while True:
	    # find a parameter
	    new_param = param_re.search(text)

jamescalam / function_parameter_extract.py

Last active May 5, 2020 13:11

Code snippet demonstrating how to extract all parameter details from function docstring using numpy/scipy docstring standards.

	import re

	# first run parameter match regex
	param_re = re.compile(r"(?s)\w+ : .*?(?=\w+ :)")
	# the above will not find the final parameter, this will
	param_re2 = re.compile(r"(?s)\w+ : .*")

	params = {}  # initialise parameter dictionary

	while True:
	    # find a parameter
	    new_param = param_re.search(text)
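The rest of the loop is cut off in the preview above. One way it could be completed is sketched below, under stated assumptions: the sample Parameters text is made up, add_param is a hypothetical helper, and the shape of the resulting dictionary is a guess rather than the gist's actual output.

```python
import re

# the two patterns from the gist
param_re = re.compile(r"(?s)\w+ : .*?(?=\w+ :)")
param_re2 = re.compile(r"(?s)\w+ : .*")

# made-up Parameters section in numpy docstring style
text = (
    "x : int\n"
    "    The first value.\n"
    "y : float, optional\n"
    "    The second value.\n"
)

params = {}  # initialise parameter dictionary


def add_param(block):
    """Hypothetical helper: split 'name : type' and description, then store them."""
    name, rest = block.split(" : ", 1)
    dtype, _, desc = rest.partition("\n")
    params[name.strip()] = {
        "type": dtype.strip(),
        "description": " ".join(desc.split()),  # collapse indentation and newlines
    }


while True:
    # find a parameter that is followed by another parameter
    new_param = param_re.search(text)
    if new_param is None:
        # no following parameter, so whatever remains is the final one
        last = param_re2.search(text)
        if last is not None:
            add_param(last.group(0))
        break
    add_param(new_param.group(0))
    # drop the text consumed so far and search again
    text = text[new_param.end():]

print(params["y"]["type"])  # float, optional
```

Note that the lookahead treats any 'word :' as the start of the next parameter, so a description that itself contains that pattern would be split early.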

jamescalam / head_example.html

Created May 5, 2020 17:23

Minimal markup needed in the HTML head of a documentation page.

	<!DOCTYPE html>
	<html lang="en">

	<head>

	<meta charset="utf-8">
	<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">

	<title>Docs</title>

James Briggs jamescalam