Manikandan Sivanesan (manisnesan)

@manisnesan
manisnesan / split-dataframe.py
Created April 27, 2020 16:06
train-valid-test split
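import pandas as pd
from sklearn.model_selection import train_test_split

def SplitSet(df):
    # Hold out 10% of the rows as the test set
    train, test = train_test_split(df, test_size=0.1)
    # Split the remaining rows (not the full df) into train and validation
    train, valid = train_test_split(train, test_size=0.2)
    # Rows before split_val are training rows; rows from split_val on are validation rows
    split_val = len(train)
    # DataFrame.append was removed in pandas 2.0; pd.concat does the same job
    train = pd.concat([train, valid])
    return train, test, split_val

df = pd.read_csv(...)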
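How `split_val` is meant to be used is easier to see on a toy input. The sketch below is a standard-library stand-in for the two `train_test_split` calls, not the gist's own code: `split_indices` and its `seed` parameter are hypothetical names, and the rounding of the split sizes may differ slightly from scikit-learn's. It returns the training and validation rows as one sequence, plus the index where the validation rows begin.

```python
import random

def split_indices(n_rows, test_size=0.1, valid_size=0.2, seed=0):
    """Shuffle row indices and cut them into train+valid, test, and a split point."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    # First cut: hold out the test rows
    n_test = round(n_rows * test_size)
    test, rest = idx[:n_test], idx[n_test:]
    # Second cut: take the validation rows from what is left, never from the test rows
    n_valid = round(len(rest) * valid_size)
    valid, train = rest[:n_valid], rest[n_valid:]
    # Training rows come first, so len(train) marks where validation starts
    split_val = len(train)
    return train + valid, test, split_val

train_valid, test, split_val = split_indices(100)
train_part = train_valid[:split_val]   # rows used for fitting
valid_part = train_valid[split_val:]   # rows used for validation
```

Slicing at `split_val` recovers the two parts without keeping a separate validation object around.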
manisnesan / middleware-process-time.py
Last active March 12, 2020 14:48
FastAPI Examples
import time
from fastapi import FastAPI, Request

app = FastAPI()

# Before returning the response, a field called 'X-Process-Time' is added to the headers
@app.middleware("http")
async def add_process_time_header(request: Request, call_next):
    start_time = time.time()
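The preview stops right after the timer starts. As a framework-free sketch of the pattern the comment describes (time the downstream handler, then attach the elapsed time as a header), assume a response is a plain dict with a `headers` mapping. The names `handler` and `timed` and the dict shape are illustrative only and are not FastAPI's API.

```python
import asyncio
import time

async def handler(request):
    # Stand-in for the downstream application (hypothetical)
    await asyncio.sleep(0.01)
    return {"status": 200, "headers": {}}

async def timed(request, call_next):
    # Start the clock, let the rest of the stack produce the response,
    # then record how long that took in a response header
    start_time = time.perf_counter()
    response = await call_next(request)
    process_time = time.perf_counter() - start_time
    response["headers"]["X-Process-Time"] = str(process_time)
    return response

response = asyncio.run(timed({"path": "/"}, handler))
```

The key point is that the header is written after `call_next` returns, so the measured time covers everything downstream of the middleware.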
manisnesan / outline.md
Last active March 8, 2020 12:37
Outline

Chrestotes

What is Chrestotes

Why Mani is doing this

Who am I

Review of background info and Understanding the Context

Working example q => vendor_name:"BalaBit S.á.r.l"

Debug section

{
    "debug": {
        "queryBoosting": {
            "q": "vendor_name:\"BalaBit S.á.r.l\"",
            "match": null
        },
BoostedQuery(
boost(
+(
+(
((psfg:jboss)^2.0 | (psfc:jboss)^4.5 | (psfd:jboss)^3.5 | (psfe:jboss)^4.0 | (psff:jboss)^2.5 | (psfa:jboss)^10.0 | (psfb:jboss)^5.5)
((psfg:jboss)^2.0 | (psfc:jboss)^4.5 | (psfd:jboss)^3.5 | (psfe:jboss)^4.0 | (psff:jboss)^2.5 | (psfa:jboss)^10.0 | (psfb:jboss)^5.5))
-((psfg:webserver-3)^2.0 | (psfc:webserver-3)^4.5 | (psfd:webserver-3)^3.5 | (psfe:webserver-3)^4.0 | (+psff:webserv +psff:3)^2.5 | (psfa:webserver-3)^10.0 | (psfb:webserver-3)^5.5))
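The per-field weights in the parsed query are easier to compare once extracted. A small sketch, using one clause copied from the debug output above; the regular expression and the variable names are illustrative.

```python
import re

# One disjunction clause copied from the parsed query
clause = ("((psfg:jboss)^2.0 | (psfc:jboss)^4.5 | (psfd:jboss)^3.5 | "
          "(psfe:jboss)^4.0 | (psff:jboss)^2.5 | (psfa:jboss)^10.0 | (psfb:jboss)^5.5)")

# Capture "(field:term)^boost" pairs as field -> boost
boosts = {field: float(weight)
          for field, weight in re.findall(r"\((\w+):[^)]*\)\^([\d.]+)", clause)}

# Fields ordered from highest to lowest boost
ranked = sorted(boosts, key=boosts.get, reverse=True)
```

In this clause `psfa` carries the largest boost (10.0) and `psfg` the smallest (2.0).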
manisnesan / tabular.py
Created February 10, 2020 14:38
TabularV2
# Source: https://muellerzr.github.io/fastshap/
from fastai2.tabular.all import *
## Download
path = untar_data(URLs.ADULT_SAMPLE)
df = pd.read_csv(path/'adult.csv')
## Preprocessing
dep_var = 'salary'
cat_names = ['workclass', 'education', 'marital-status', 'occupation', 'relationship', 'race']
manisnesan / torch-gpu.py
Created December 5, 2019 22:04
How to check whether PyTorch is using the GPU
# https://forums.fast.ai/t/how-to-check-your-pytorch-keras-is-using-the-gpu/7232
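import torch
torch.cuda.is_available()      # True if a CUDA device is usable; check this first
torch.cuda.current_device()    # index of the currently selected device
torch.cuda.device(0)           # context-manager object for device 0
torch.cuda.device_count()      # number of visible GPUs
torch.cuda.get_device_name(0)  # human-readable name of device 0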
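import pickle

# MODEL_DIR is assumed to be defined elsewhere (not shown in this snippet)
def load_model():
    #global vectorizer, model
    # Load the fitted TF-IDF vectorizer and the intent classifier from disk
    with open(MODEL_DIR + "/tfidf_vectorizer.pkl", "rb") as f:
        vectorizer = pickle.load(f)
    with open(MODEL_DIR + "/intent_clf.pkl", "rb") as f:
        model = pickle.load(f)
    return vectorizer, model
vectorizer, model = load_model()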
/home/msivanes/miniconda3/envs/anlp/lib/python3.6/site-packages/sklearn/base.py:251: UserWarning: Trying to unpickle estimator TfidfTransformer from version 0.21.2 when using version 0.20.2. This might lead to breaking code or invalid results. Use at your own risk.
UserWarning)
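The warning above appears because the estimator was pickled under one scikit-learn version (0.21.2) and loaded under another (0.20.2). One way to catch this earlier is to store the library version next to the object and compare it at load time. This is a minimal sketch using only the standard library; `dump_with_version` and `load_with_version` are hypothetical helpers, and in practice the version string would come from `sklearn.__version__`.

```python
import io
import pickle

def dump_with_version(obj, fileobj, lib_version):
    """Pickle the object together with the library version that produced it."""
    pickle.dump({"lib_version": lib_version, "obj": obj}, fileobj)

def load_with_version(fileobj, current_version):
    """Unpickle and refuse to return the object if the versions differ."""
    payload = pickle.load(fileobj)
    if payload["lib_version"] != current_version:
        raise RuntimeError(
            f"pickled with {payload['lib_version']}, running {current_version}"
        )
    return payload["obj"]

# Round trip through an in-memory buffer with matching versions
buf = io.BytesIO()
dump_with_version({"weights": [0.1, 0.2]}, buf, "0.21.2")
buf.seek(0)
restored = load_with_version(buf, "0.21.2")
```

Raising instead of warning is a design choice for this sketch; a stricter check makes the mismatch impossible to miss, at the cost of refusing pickles that might still have worked.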
manisnesan / schema_extra_types.xml
Created November 11, 2019 16:31
Field type definitions used in Portal Search fields (psfa, psfb, psfc, psfd, psfe - text_std, psff - text_noUnderscore)
<!-- Standard text field for content types not included in recommendations, portal search -->
<fieldType name="text_std" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- remove period (dot) character from end of token -->
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="(\.)(\s|$)" replacement=" "/>
    <!-- remove question mark (?) character from end of token. Index time only -->
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="(\?)(\s|$)" replacement=" "/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LengthFilterFactory" min="1" max="10000" />
    <filter class="solr.LowerCaseFilterFactory"/>
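To see what the visible part of this index-time chain does to a string, here is a rough Python approximation. It is a sketch only: `analyze_text_std` is a hypothetical name, Python's `re` stands in for Java regular expressions, and any filters that follow the truncated listing are not modeled.

```python
import re

def analyze_text_std(text):
    """Approximate the visible index-time steps of the text_std analyzer."""
    # charFilter: drop a period that ends a token
    text = re.sub(r"(\.)(\s|$)", " ", text)
    # charFilter: drop a question mark that ends a token
    text = re.sub(r"(\?)(\s|$)", " ", text)
    # WhitespaceTokenizerFactory: split on whitespace
    tokens = text.split()
    # LengthFilterFactory: keep tokens of 1 to 10000 characters
    tokens = [t for t in tokens if 1 <= len(t) <= 10000]
    # LowerCaseFilterFactory: lowercase every token
    return [t.lower() for t in tokens]

tokens = analyze_text_std("JBoss EAP 7.2. Is it up?")
```

The dot inside `7.2` survives because it is not followed by whitespace or the end of the input, while the sentence-final period and question mark are removed.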
import pke

def keyphrases(text):
    # define the set of valid Part Of Speech tags
    pos = {'NOUN', 'PROPN', 'ADJ'}
    # create a SingleRank extractor
    singleRank_extractor = pke.unsupervised.SingleRank()
    # load the content of the document
    singleRank_extractor.load_document(input=text, language='en', normalization=None)