okdolly-001’s gists

okdolly-001 / timeit.py

Created January 22, 2018 01:25

print the running time

	from datetime import datetime
	startTime = datetime.now()

	# do something

	print(datetime.now() - startTime)
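A possible variation on the timing snippet above, sketched as a reusable context manager. The name `timed` and its `label` parameter are illustrative and not part of the original gist; `time.perf_counter` is used here because it is a monotonic clock meant for measuring intervals.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label="elapsed"):
    # `timed` is a hypothetical helper name, not from the original gist
    start = time.perf_counter()
    try:
        yield
    finally:
        # Report the wall-clock time spent inside the `with` block
        print(f"{label}: {time.perf_counter() - start:.4f} s")


with timed("sum"):
    total = sum(range(1_000_000))
```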

okdolly-001 / delete_part_string.py

Last active March 17, 2018 10:21

delete part of a string in pandas #pandas #python

	# Delete substrings directly in a CSV file
	infile = "test.csv"
	outfile = "test_edit.csv"

	delete_list = ["b'", "'"]
	with open(infile) as fin, open(outfile, "w") as fout:
	    for line in fin:
	        for word in delete_list:
	            line = line.replace(word, "")
	        # assumed final step: write the cleaned line to the output file
	        fout.write(line)
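Since the gist is tagged #pandas, the same cleanup could probably also be done on a loaded column. The sketch below assumes a hypothetical column called `name` whose values look like Python bytes reprs. It uses an anchored regular expression rather than plain replacement, because replacing `b'` everywhere would also eat the end of a value such as `b'bob'`.

```python
import pandas as pd

# Hypothetical data: the column name and values are made up for illustration
df = pd.DataFrame({"name": ["b'alice'", "b'bob'", "carol"]})

# Strip a leading b' and a trailing ' only when they wrap the whole value;
# values that do not match the pattern are left unchanged
df["name"] = df["name"].str.replace(r"^b'(.*)'$", r"\1", regex=True)
```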

okdolly-001 / panda_delete_col.py

Last active March 16, 2018 02:59

Deleting columns with all of the same value #pandas

df = df.drop(df.std()[df.std() == 0].index, axis=1)
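The standard-deviation test only makes sense for numeric columns, and depending on the pandas version `df.std()` may raise on a frame that contains strings. An alternative sketch, assuming a small made-up frame, counts distinct values instead:

```python
import pandas as pd

# Made-up example frame with one numeric and one string constant column
df = pd.DataFrame({"a": [1, 1, 1], "b": [1, 2, 3], "c": ["x", "x", "x"]})

# A column whose values are all the same has exactly one distinct value
constant_cols = df.columns[df.nunique() == 1]
reduced = df.drop(columns=constant_cols)
```

Note that `nunique()` ignores NaN by default, so a column that is entirely NaN has zero distinct values and would not be dropped by this check.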

okdolly-001 / load_npy.py

Last active March 17, 2018 10:19

load #np

	import numpy as np

	# Load the arrays, truncate them, and save to CSV
	train = np.load('ecs171train.npy')
	test = np.load('ecs171train.npy')  # note: this loads the same file as train
	# Use Atom and a terminal to load the whole dataset if it is larger than 200 MB
	train = train[:51000]
	test = test[:51000]
	np.savetxt('test.csv', test, fmt='%s')
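`np.savetxt` separates values with a space unless told otherwise, so the file written above is space-delimited despite its `.csv` name. A minimal sketch of a comma-delimited round trip, using a small stand-in array and a temporary file rather than the original dataset:

```python
import os
import tempfile

import numpy as np

# Stand-in for the loaded array; the real data is not available here
arr = np.arange(6, dtype=float).reshape(2, 3)

path = os.path.join(tempfile.mkdtemp(), "sample.csv")

# delimiter="," writes comma-separated values instead of the default space
np.savetxt(path, arr, delimiter=",", fmt="%s")

# Read the file back to confirm the values survive the round trip
restored = np.loadtxt(path, delimiter=",")
```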

okdolly-001 / duplicate_col.py

Created March 16, 2018 03:04

Delete duplicate columns by value #pandas

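	import pandas as pd

	# Delete columns whose values duplicate an earlier column
	df = pd.read_csv('data/test.csv')
	df = df.loc[:, ~df.T.duplicated()]
	# index=False keeps the row index from being written back as an extra column
	df.to_csv('data/test.csv', index=False)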
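To see what the transpose trick does, here is a small sketch on a made-up frame in which `a_copy` repeats the values of `a`:

```python
import pandas as pd

# Made-up frame: "a_copy" has the same values as "a" under a different name
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "a_copy": [1, 2, 3]})

# Transposing turns columns into rows, so duplicated() flags repeated columns;
# the first occurrence is kept because keep="first" is the default
deduped = df.loc[:, ~df.T.duplicated()]
```

Transposing a very wide or very long frame copies the data, so this approach may be slow on large datasets.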

okdolly-001 / xgb.sklearn.py

Last active March 16, 2018 08:53

xgb.sklearn

	import xgboost as xgb

	clf = xgb.sklearn.XGBClassifier(
	    objective="binary:logistic",
	    learning_rate=0.05,
	    seed=9616,
	    max_depth=20,
	    gamma=10,
	    n_estimators=500)

	# X_train, Y_train and eval_set (a list of (X, y) pairs) are defined elsewhere
	clf.fit(X_train, Y_train, early_stopping_rounds=20, eval_metric="auc", eval_set=eval_set, verbose=True)

okdolly-001 / binarize_panda.py

Created March 17, 2018 06:47

Binarize integer in a pandas dataframe #pandas

y_train['loss'] = (y_train['loss'] > 0.0).astype(int)
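A small sketch of the one-liner above on made-up values, to show the effect of the comparison followed by the integer cast:

```python
import pandas as pd

# Made-up loss values; the real y_train is not shown in the gist
y_train = pd.DataFrame({"loss": [0.0, 2.5, 0.0, 7.0]})

# The comparison yields booleans; astype(int) maps False to 0 and True to 1
y_train["loss"] = (y_train["loss"] > 0.0).astype(int)
```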

okdolly-001 / prepend_a_line.py

Created March 17, 2018 10:22

Prepend line to beginning of a file #file #python

	# Prepend a line at the top of the file
	def line_prepender(filename, line):
	    with open(filename, 'r+') as f:
	        content = f.read()
	        f.seek(0, 0)
	        f.write(line.rstrip('\r\n') + '\n' + content)


	infile = "c_test.csv"
	outfile = "complete_test.csv"
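How the function might be used, sketched on a temporary file rather than the gist's own CSV files. The header text `a,b` and the file contents are made up for illustration; the function body is repeated so the example runs on its own.

```python
import os
import tempfile


def line_prepender(filename, line):
    # Read everything, rewind, then write the new line followed by the old content
    with open(filename, "r+") as f:
        content = f.read()
        f.seek(0, 0)
        f.write(line.rstrip("\r\n") + "\n" + content)


# Temporary file standing in for a headerless CSV
path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(path, "w") as f:
    f.write("1,2\n3,4\n")

line_prepender(path, "a,b")

with open(path) as f:
    result = f.read()
```

Because the whole file is read into memory first, this approach is best suited to files that fit comfortably in RAM.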

okdolly-001 / check_nan_inf.py

Created March 17, 2018 21:23

Check Nan and infinity in pandas and Numpy #np #pandas

	1. Check if all elements in pandas/numpy are finite. Note that notnull() only detects NaN/None; unlike np.isfinite, it does not flag infinity.

	Pandas:
	X_train.notnull().values.all()

	Numpy:
	np.isfinite(X_train).all()


	2. Check if any element in pandas is NaN.
	Pandas:
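A short sketch of how these checks behave, using made-up frames. It shows that an infinite value passes the `notnull()` check but fails `np.isfinite`, and adds one common way to test for any NaN (`isnull().values.any()`), which is an assumption about what the second item was heading toward.

```python
import numpy as np
import pandas as pd

# Made-up frame containing an infinity but no NaN
X_train = pd.DataFrame({"a": [1.0, np.inf], "b": [3.0, 4.0]})

# notnull() treats inf as a valid value, so this is True
no_missing = X_train.notnull().values.all()

# isfinite() rejects both NaN and +/-inf, so this is False
all_finite = np.isfinite(X_train.to_numpy()).all()

# Made-up frame containing a NaN
X_nan = pd.DataFrame({"a": [1.0, np.nan]})

# True if at least one element is NaN
has_nan = X_nan.isnull().values.any()
```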

okdolly-001 / replace_pandas.py

Created March 22, 2018 01:39

Replace values in dataframe from another dataframe ? #pandas

	1. Substitute the NaNs in a dataframe with values from another dataframe

	If you have two DataFrames of the same shape, then:

	df[df.isnull()] = d2



	2. Replace values in a dataframe with values from another dataframe by conditions
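One possible way to do both replacements, sketched on made-up frames. `fillna` with a second frame is an alternative to the boolean-mask assignment in item 1, and `mask` is one option for item 2. The condition `df > 3` is purely illustrative and not taken from the gist.

```python
import numpy as np
import pandas as pd

# Made-up frames with matching shape, index and column labels
df = pd.DataFrame({"a": [1.0, np.nan], "b": [np.nan, 4.0]})
d2 = pd.DataFrame({"a": [10.0, 20.0], "b": [30.0, 40.0]})

# 1. NaNs in df take the value found at the same label in d2
filled = df.fillna(d2)

# 2. Where the (illustrative) condition holds, take the value from d2;
#    everywhere else the original value, including NaN, is kept
replaced = df.mask(df > 3, d2)
```

Both calls return new frames and leave `df` unchanged.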
