- Create a Fast.ai machine from public templates w/ P4000 and public IP
- $ source deactivate fastai
- $ pip install virtualenv
# --- Handling missing values ---
# Impute NaNs in "height" with a hand-picked constant.
avg_height = 67  # Maybe this is a good number
data["height"] = data["height"].fillna(avg_height)

# Impute NaNs with a statistic computed from the data itself.
avg_height = data["height"].median()  # This is probably more accurate
data["height"] = data["height"].fillna(avg_height)

# Dropping rows with missing values:
# check which rows of "height" aren't null (that code is not shown in this excerpt).

# --- Formatting data ---
data['state'] = data['state'].str.upper()  # Capitalize the whole thing
data['state'] = data['state'].replace(  # Normalise the spelling of the state name
    to_replace=["CA", "C.A", "CALI"],
    value=["CALIFORNIA", "CALIFORNIA", "CALIFORNIA"])

# Dates and times are quite common in large datasets.
# Converting all strings to datetime objects is good standardisation practice.
# Here, the data["time"] strings look like "2019-01-15", which is exactly
# how the "format" argument is set (the conversion code is not shown in this excerpt).
| import numpy as np | |
| import multiprocessing as multi | |
def chunks(n, page_list):
    """Partition *page_list* into n roughly equal sub-arrays.

    Thin wrapper over ``np.array_split``, which tolerates lengths that
    are not evenly divisible by ``n``.
    """
    return np.array_split(page_list, n)
# Size the worker pool to the machine's CPU count.
cpus = multi.cpu_count()
workers = []  # process handles will be collected here
| page_list = ['www.website.com/page1.html', 'www.website.com/page2.html' |
| """ | |
| Minimal character-level Vanilla RNN model. Written by Andrej Karpathy (@karpathy) | |
| BSD License | |
| """ | |
| import numpy as np | |
# data I/O
# Load the training corpus; should be a simple plain-text file.
# Fix: use a context manager so the file handle is always closed
# (the original `open(...).read()` leaked the handle).
with open('input.txt', 'r') as f:
    data = f.read()
chars = list(set(data))  # unique characters form the vocabulary (order is arbitrary)
data_size, vocab_size = len(data), len(chars)