Manikandan Sivanesan (manisnesan)

Software Builder & Tinkerer.

manisnesan / split-dataframe.py

Created April 27, 2020 16:06

Train/validation/test split

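	import pandas as pd
	from sklearn.model_selection import train_test_split

	def SplitSet(df):
	    # Hold out 10% for test, then 20% of the remaining rows for validation
	    train, test = train_test_split(df, test_size=0.1)
	    train, valid = train_test_split(train, test_size=0.2)
	    # Rows before split_val are training rows; rows from split_val on are validation rows
	    split_val = len(train)
	    train = pd.concat([train, valid])
	    return train, test, split_val

	df = pd.read_csv(...)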
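
Assuming the intent is to hold out 10% of the rows for test and then 20% of the remaining rows for validation, the resulting partition sizes can be sketched with the standard library alone. The helper name `split_sizes` is hypothetical, and rounding the held-out count up is an assumption about how a fractional `test_size` is turned into a row count.

```python
import math


def split_sizes(n_rows, test_size=0.1, valid_size=0.2):
    """Return (n_train, n_valid, n_test) for two chained hold-out splits."""
    # Assumption: a fractional hold-out size is rounded up to a whole row count.
    n_test = math.ceil(test_size * n_rows)
    remaining = n_rows - n_test
    n_valid = math.ceil(valid_size * remaining)
    n_train = remaining - n_valid
    return n_train, n_valid, n_test


n_train, n_valid, n_test = split_sizes(1000)
print(n_train, n_valid, n_test)  # 720 180 100
```

In the gist's terms, `split_val` corresponds to `n_train`: it marks where the training rows end and the validation rows begin in the recombined frame.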

manisnesan / middleware-process-time.py

Last active March 12, 2020 14:48

FastAPI examples

	import time

	from fastapi import FastAPI, Request

	app = FastAPI()

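	# Before returning the response, add an 'X-Process-Time' header holding the elapsed time
	@app.middleware("http")
	async def add_process_time_header(request: Request, call_next):
	    start_time = time.time()
	    response = await call_next(request)
	    response.headers["X-Process-Time"] = str(time.time() - start_time)
	    return response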

manisnesan / outline.md

Last active March 8, 2020 12:37

Outline

Chrestotes

manisnesan / example.md

Last active February 14, 2020 15:16

{
    "debug": {
        "queryBoosting": {
            "q": "vendor_name:\"BalaBit S.á.r.l\"",
            "match": null
        },

manisnesan / gist:908b9c91434cd1bce617f8feb1add99b

Created February 12, 2020 22:43

	BoostedQuery(
	boost(
	+(
	+(
	((psfg:jboss)^2.0 | (psfc:jboss)^4.5 | (psfd:jboss)^3.5 | (psfe:jboss)^4.0 | (psff:jboss)^2.5 | (psfa:jboss)^10.0 | (psfb:jboss)^5.5)

	((psfg:jboss)^2.0 | (psfc:jboss)^4.5 | (psfd:jboss)^3.5 | (psfe:jboss)^4.0 | (psff:jboss)^2.5 | (psfa:jboss)^10.0 | (psfb:jboss)^5.5))

	-((psfg:webserver-3)^2.0 | (psfc:webserver-3)^4.5 | (psfd:webserver-3)^3.5 | (psfe:webserver-3)^4.0 | (+psff:webserv +psff:3)^2.5 | (psfa:webserver-3)^10.0 | (psfb:webserver-3)^5.5))

manisnesan / tabular.py

Created February 10, 2020 14:38

TabularV2

	# Source: https://muellerzr.github.io/fastshap/
	from fastai2.tabular.all import *

	## Download
	path = untar_data(URLs.ADULT_SAMPLE)
	df = pd.read_csv(path/'adult.csv')

	## Preprocessing
	dep_var = 'salary'
	cat_names = ['workclass', 'education', 'marital-status', 'occupation', 'relationship', 'race']

manisnesan / torch-gpu.py

Created December 5, 2019 22:04

How to check whether PyTorch is using the GPU

	# https://forums.fast.ai/t/how-to-check-your-pytorch-keras-is-using-the-gpu/7232
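	import torch
	torch.cuda.is_available()      # True if a CUDA-capable GPU can be used
	torch.cuda.current_device()    # index of the currently selected device
	torch.cuda.device(0)           # context-manager object for device 0
	torch.cuda.device_count()      # number of visible GPUs
	torch.cuda.get_device_name(0)  # name of device 0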

manisnesan / gist:1f12a6f8918ba3a2b2a46dbb56dc2b32

Created November 21, 2019 14:50

	import pickle

	def load_model():
	    #global vectorizer, model
	    # MODEL_DIR is expected to be defined elsewhere in the calling code
	    with open(MODEL_DIR + "/tfidf_vectorizer.pkl", "rb") as f:
	        vectorizer = pickle.load(f)
	    with open(MODEL_DIR + "/intent_clf.pkl", "rb") as f:
	        model = pickle.load(f)
	    return vectorizer, model

	vectorizer, model = load_model()

	/home/msivanes/miniconda3/envs/anlp/lib/python3.6/site-packages/sklearn/base.py:251: UserWarning: Trying to unpickle estimator TfidfTransformer from version 0.21.2 when using version 0.20.2. This might lead to breaking code or invalid results. Use at your own risk.
	UserWarning)
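
The warning above appears because the estimator was pickled under scikit-learn 0.21.2 and unpickled under 0.20.2. One way to surface such a mismatch explicitly is to store the library version next to the object and compare it at load time. This is only a sketch: the helper names `save_with_version` and `load_with_version` and the `lib_version` key are hypothetical, not part of pickle or scikit-learn.

```python
import pickle


def save_with_version(obj, path, lib_version):
    """Pickle obj together with the library version that produced it."""
    with open(path, "wb") as f:
        pickle.dump({"lib_version": lib_version, "obj": obj}, f)


def load_with_version(path, current_version):
    """Load the object, raising if it was saved under a different version."""
    with open(path, "rb") as f:
        payload = pickle.load(f)
    if payload["lib_version"] != current_version:
        raise RuntimeError(
            f"pickled with {payload['lib_version']}, running {current_version}"
        )
    return payload["obj"]
```

In practice the version string passed to both helpers would likely be `sklearn.__version__`, so a mismatch fails loudly instead of producing a warning that is easy to miss.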

manisnesan / schema_extra_types.xml

Created November 11, 2019 16:31

Field type definitions used in Portal Search fields (psfa, psfb, psfc, psfd, psfe: text_std; psff: text_noUnderscore)

	<!-- Standard text field for content types not included in recommendations, portal search -->
	<fieldType name="text_std" class="solr.TextField" positionIncrementGap="100">
	  <analyzer type="index">
	    <!-- remove period (dot) character from end of token -->
	    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="(\.)(\s|$)" replacement=" "/>
	    <!-- remove question mark (?) character from end of token. Index time only -->
	    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="(\?)(\s|$)" replacement=" "/>
	    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
	    <filter class="solr.LengthFilterFactory" min="1" max="10000" />
	    <filter class="solr.LowerCaseFilterFactory"/>
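
To see what the visible part of this index-time chain does to a piece of text, it can be approximated in plain Python. This is a rough sketch, not Solr itself: it reads the two patterns as `(\.)(\s|$)` and `(\?)(\s|$)`, covers only the steps shown in the truncated definition, and uses a hypothetical function name, `analyze_text_std`.

```python
import re

# Character filters applied before tokenization.
TRAILING_PERIOD = re.compile(r"(\.)(\s|$)")
TRAILING_QUESTION = re.compile(r"(\?)(\s|$)")


def analyze_text_std(text, min_len=1, max_len=10000):
    """Approximate the visible steps of the text_std index analyzer."""
    text = TRAILING_PERIOD.sub(" ", text)
    text = TRAILING_QUESTION.sub(" ", text)
    tokens = text.split()  # whitespace tokenizer
    tokens = [t for t in tokens if min_len <= len(t) <= max_len]  # length filter
    return [t.lower() for t in tokens]  # lowercase filter


print(analyze_text_std("Is JBoss EAP 7.2 running? Yes."))
# ['is', 'jboss', 'eap', '7.2', 'running', 'yes']
```

The period inside `7.2` survives because the pattern only matches a dot followed by whitespace or the end of the input.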

manisnesan / keyphrases.py

Created October 14, 2019 13:06

	import pke

	def keyphrases(text):

	    # define the set of valid part-of-speech tags
	    pos = {'NOUN', 'PROPN', 'ADJ'}

	    # create a SingleRank extractor
	    singleRank_extractor = pke.unsupervised.SingleRank()

	    # load the content of the document
	    singleRank_extractor.load_document(input=text, language='en', normalization=None)