elyase’s gists

elyase / gist:8161939

Last active January 1, 2016 15:08

elyase / gist:7932059

Last active December 31, 2015 04:09

	{'INTEGER': 12, 'HTMLTAG': 10, 'int': 4, 'formattedsize': 4, 'sizeindex': 4, 'string': 3,
	'sizes': 3, 'decimals': 3, 'size': 2, 'code': 2, 'blockquote': 2, 'permitted': 2, 'specifiers': 2,
	'default': 2, 'parameter': 2, 'FUNCTIONCALL': 2, 'CODE': 1, 'private': 1, 'eb': 1, 'gt': 1,
	'gb': 1, 'error': 1, 'application': 1, 'format': 1, 'desktop': 1, 'pb': 1, 'formatsizebinary': 1,
	'lt': 1, 'tb': 1, 'math': 1, 'return': 1, 'kb': 1, 'yb': 1, 'tostring': 1, 'zb': 1, 'amp': 1,
	'mb': 1, 'bytes': 1, 'length': 1, 'double': 1}


	['default',
	'parameter',

elyase / gist:7594427

Created November 22, 2013 03:41

elyase / gist:7556915

Last active December 28, 2015 20:19

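	import csv

	# row_gen is the generator of [left, right] rows built in gist:7556908
	# Python 3: open in text mode with newline='' (Python 2 used 'wb')
	with open('companies.csv', 'w', newline='') as csvfile:
	    csv.writer(csvfile, delimiter=',').writerows(row_gen)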
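
The snippet above hands a generator straight to `writerows`. As a minimal sketch of that pattern, assuming Python 3, the example below writes to an in-memory buffer instead of `companies.csv`. The helper `rows_to_csv` and the sample rows are hypothetical and are not part of the original gist.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize any iterable of rows (lists of strings) to CSV text."""
    buffer = io.StringIO(newline='')
    # writerows consumes the iterable lazily, one row at a time
    csv.writer(buffer, delimiter=',').writerows(rows)
    return buffer.getvalue()


# Hypothetical stand-in for row_gen: a generator of [left, right] pairs
sample_rows = ([left, right] for left, right in [
    ('Entry A, first', 'Active'),
    ('Entry B', 'Inactive'),
])

csv_text = rows_to_csv(sample_rows)
print(csv_text)
```

A field that contains the delimiter is quoted automatically, which is one reason to prefer `csv.writer` over joining strings by hand.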

elyase / gist:7556908

Last active December 28, 2015 20:19

	row_gen = ( [td.text(), td.next().text()] # left, right element
	            for table in d('.borderless').items()
	            for td in table('td:nth-child(1)').items() # left column
	            if table('th:first').text() == 'NUANS Reports & Preliminary Searches' and
	               td.next().text() in ('Active', 'Inactive') )

	10 loops, best of 3: 172 ms per loop

elyase / gist:7556900

Created November 20, 2013 02:55

	l = []
	for th in d.items('.borderless td:nth-child(1)'):
	    left = th.text()
	    right = th.next().text()
	    tr = th.parent()
	    tbody = tr.parent()
	    title = tbody('th:first').text() # first element
	    if title == 'NUANS Reports & Preliminary Searches' and right in ['Active', 'Inactive']:
	        l.append([left, right])
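
The explicit loop above and the generator expression in gist:7556908 appear to apply the same filter: keep a row only when its table title matches and its right-hand cell is a known status. The sketch below shows that equivalence on plain tuples, with no HTML parsing. The records and the function names are hypothetical placeholders, not scraped data.

```python
TITLE = 'NUANS Reports & Preliminary Searches'
STATUSES = ('Active', 'Inactive')

# Hypothetical records: (table title, left cell, right cell)
records = [
    (TITLE, 'row A', 'Active'),
    (TITLE, 'header cell', 'Status'),    # dropped: right cell is not a status
    ('Some other table', 'row B', 'Active'),  # dropped: wrong table title
    (TITLE, 'row C', 'Inactive'),
]


def filter_with_loop(records):
    """Explicit loop: append each matching row to a list."""
    rows = []
    for title, left, right in records:
        if title == TITLE and right in STATUSES:
            rows.append([left, right])
    return rows


def filter_with_generator(records):
    """Same filter as a lazy generator expression."""
    return ([left, right] for title, left, right in records
            if title == TITLE and right in STATUSES)


loop_rows = filter_with_loop(records)
gen_rows = list(filter_with_generator(records))
print(loop_rows == gen_rows)
```

The generator form yields rows on demand, so it can be passed directly to a consumer such as `csv.writer(...).writerows` without building the full list first.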

elyase / gist:7555860

Created November 20, 2013 01:19

	from pyquery import PyQuery as pq

	url = 'https://www.nuans.com/RTS2/en/jur_codes-codes_jur_en.cgi#Example_of_report_layouts'
	d = pq(url)

elyase / count_motifs.py

Last active December 28, 2015 13:39

Counts motif occurrences in a list of DNA sequences

	from sklearn.feature_extraction.text import CountVectorizer
	import numpy as np

	def tokenizer(s):
	    # split s into overlapping windows of `width` characters
	    width = 7
	    return [s[i:i+width] for i in range(len(s)-width+1)]

	def count_chunks(sequence_list):
	    vectorizer = CountVectorizer(tokenizer=tokenizer)
	    X = vectorizer.fit_transform(sequence_list)
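
The preview above stops before `count_chunks` returns anything, so the sketch below does not try to complete it. Instead it illustrates the same sliding-window idea with only the standard library, swapping `CountVectorizer` for `collections.Counter`. The names `kmers` and `count_motifs` and the sample sequences are assumptions made for illustration.

```python
from collections import Counter


def kmers(sequence, width=7):
    """Return every overlapping window of `width` characters."""
    return [sequence[i:i + width] for i in range(len(sequence) - width + 1)]


def count_motifs(sequences, width=7):
    """Total occurrences of each window across all sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(kmers(seq, width))
    return counts


# Hypothetical toy sequences, each 10 bases long (4 windows per sequence)
sequences = ['ACGTACGTAC', 'CGTACGTACG']
motif_counts = count_motifs(sequences)
print(motif_counts.most_common(2))
```

Two differences from the vectorizer approach are worth keeping in mind: `Counter` gives totals over the whole list rather than one row of counts per sequence, and it leaves case untouched, whereas `CountVectorizer` lowercases its input by default.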

elyase / gist:7050488

Last active December 25, 2015 22:29

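	# Imports for current scikit-learn (older releases used
	# sklearn.cross_validation and sklearn.grid_search)
	from sklearn.model_selection import GridSearchCV, train_test_split
	from sklearn.svm import SVC

	# X, y and score are assumed to be defined earlier (not shown in this preview)
	# Split the dataset in two equal parts
	X_train, X_test, y_train, y_test = train_test_split(
	    X, y, test_size=0.5, random_state=0)

	# Set the parameters by cross-validation
	tuned_parameters = [{'kernel': ['rbf'], 'gamma': [1e-3, 1e-4],
	                     'C': [1, 10, 100, 1000]},
	                    {'kernel': ['linear'], 'C': [1, 10, 100, 1000]}]
	model = GridSearchCV(SVC(C=1), tuned_parameters, cv=5, scoring=score)
	model.fit(X_train, y_train)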
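
To make the size of that search concrete, the sketch below expands the same `tuned_parameters` list into individual candidates using only `itertools`. It is a simplified stand-in for what a grid search enumerates (scikit-learn provides `ParameterGrid` for this purpose); `expand_grid` is a hypothetical helper, and no model is trained here.

```python
from itertools import product

tuned_parameters = [
    {'kernel': ['rbf'], 'gamma': [1e-3, 1e-4], 'C': [1, 10, 100, 1000]},
    {'kernel': ['linear'], 'C': [1, 10, 100, 1000]},
]


def expand_grid(grids):
    """Yield one dict per parameter combination, grid by grid."""
    for grid in grids:
        names = sorted(grid)
        # Cartesian product of the value lists within a single grid
        for values in product(*(grid[name] for name in names)):
            yield dict(zip(names, values))


candidates = list(expand_grid(tuned_parameters))
n_folds = 5  # matches cv=5 in the snippet above
total_fits = len(candidates) * n_folds
print(len(candidates), 'candidates,', total_fits, 'cross-validation fits')
```

The two dictionaries are expanded separately, so `gamma` is only varied for the RBF kernel: 8 RBF candidates plus 4 linear ones, each fitted once per fold.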

elyase / gist:7050373

Created October 19, 2013 00:44 — forked from anonymous/gist:7050368

	function svmStruct = best_svm_classifer_rbf(cdata,labels)


	%Write a function called crossfun to calculate the predicted classification yfit from a test vector
	%xtest, when the SVM is trained on a sample xtrain that has classification ytrain.

	function yfit = crossfun(xtrain,ytrain,xtest, rbf_sigma, boxconstraint)

	% Train the model on xtrain, ytrain,
	% and get predictions of class of xtest and output it as yfit

Yaser Martinez Palenzuela elyase