Karimkhan karimkhanp

@karimkhanp
karimkhanp / opennmt_cheatshit
Created October 8, 2018 07:42
OpenNMT documentation
About Perplexity - https://planspace.org/2013/09/23/perplexity-what-it-is-and-what-yours-is/
karimkhanp / datascience_internview_question
Last active December 27, 2021 13:57
Data science interview question
https://www.simplilearn.com/tutorials/deep-learning-tutorial/deep-learning-interview-questions
https://www.javatpoint.com/deep-learning-interview-questions
Difference between training, dev and test set
A training dataset is a dataset of examples used during the learning process and is used to fit the parameters (e.g., weights) of, for example, a classifier.[7][8]
Dev/validation: A validation dataset is a dataset of examples used to tune the hyperparameters (e.g., the architecture) of a classifier. It is sometimes also called the development set or the "dev set". One example of a hyperparameter for artificial neural networks is the number of hidden units in each layer.
A test dataset is a dataset that is independent of the training dataset, but that follows the same probability distribution as the training dataset.
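The three-way split described above can be sketched in a few lines of Python. This is only an illustration: the function name `split_dataset`, the 80/10/10 proportions and the fixed seed are assumptions made here, not something the notes prescribe.

```python
import random

def split_dataset(examples, dev_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle the examples and split them into train, dev and test lists.

    The fractions and the seed are illustrative defaults (assumptions).
    """
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # shuffle so every split follows the same distribution
    n_test = int(len(shuffled) * test_fraction)
    n_dev = int(len(shuffled) * dev_fraction)
    test = shuffled[:n_test]                 # held out, used only for the final evaluation
    dev = shuffled[n_test:n_test + n_dev]    # used to tune hyperparameters
    train = shuffled[n_test + n_dev:]        # used to fit the parameters (weights)
    return train, dev, test

train, dev, test = split_dataset(range(100))
print(len(train), len(dev), len(test))
```

Shuffling before slicing is what keeps the test set on the same distribution as the training set while still sharing no examples with it.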
What is bias?
Bias is the difference between the average prediction of our model and the correct value we are trying to predict. A model with high bias pays very little attent
karimkhanp / nlp_preprocess.py
Created July 23, 2018 14:54
nlp pre processing using nltk
import sys, pdb
import nltk, pprint
from nltk.tokenize import word_tokenize
from nltk.tokenize import sent_tokenize
from nlp_opn import CuriaNLP
from mongo_op import MongoOperation
"""
NltkSentTokenize Class for all nltk sent tokenize
"""
karimkhanp / mlbasics
Last active June 22, 2018 14:01
Basics of Machine learning
Regression: (https://www.quora.com/What-is-regression)
Regression is the dependence of one variable on another. The statistical method that estimates the unknown value of one variable (the dependent variable) from the known value of a related variable (the independent variable) is called regression.
Regression estimates the relationship among variables for prediction.
Regression analysis helps to understand how the dependent variable changes when some of the independent variables are varied, while the other independent variables are held fixed.
It determines the relationship between one dependent variable and a number of other independent variables.
Linear Regression
Simple linear regression lets you determine the functional dependency between two sets of numbers. For example, we can use regression to determine the relation between ice cream sales and average temperature.
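As a rough sketch of the ice cream example, the ordinary least-squares line can be computed directly. The temperature and sales numbers below are made up for illustration, and `fit_simple_linear_regression` is a hypothetical helper name, not part of any library.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel in the ratio)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up illustration data (assumption): average temperature vs. ice cream sales
temperatures = [20.0, 25.0, 30.0, 35.0]
ice_cream_sales = [250.0, 300.0, 350.0, 400.0]

slope, intercept = fit_simple_linear_regression(temperatures, ice_cream_sales)
predicted_sales = slope * 32.0 + intercept  # estimate sales at a new temperature
print(slope, intercept, predicted_sales)
```

Here temperature is the independent variable and sales is the dependent variable whose unknown value is estimated.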
karimkhanp / nltk_functions.py
Created February 12, 2018 11:49
Contains various nltk functions
import sys
"""
NltkSentTokenize Class for all nltk sent tokenize
"""
class NltkSentTokenize(object):
    """
    Initialization function of NltkSentTokenize Class
    """
    def __init__(self):
karimkhanp / numerical_analysis
Created October 31, 2017 16:12
Steps and resources for numerical analysis
Links
http://bridgei2i.com/ebook/churn-propensity-model/#page/8
karimkhanp / test.tsv
Created December 18, 2016 05:35
Test data (numerical val prediction)
date hr_of_day vals
2014-05-01 0 0
2014-05-01 1 0
2014-05-01 2 0
2014-05-01 3 0
2014-05-01 4 0
2014-05-01 5 0
2014-05-01 6 0
2014-05-01 7 0
karimkhanp / train.tsv
Last active December 18, 2016 05:34
training data (Numerical val prediction)
date hr_of_day vals
2014-05-01 0 72
2014-05-01 1 127
2014-05-01 2 277
2014-05-01 3 411
2014-05-01 4 666
2014-05-01 5 912
2014-05-01 6 1164
2014-05-01 7 1119
2014-05-01 8 951
karimkhanp / chatbot.txt
Created November 24, 2016 11:00
Steps on how to build chatbot
What is Chatbot?
Types of Chatbot - Open domain vs Closed domain, Rules based vs General AI
Different approaches to build chatbots
Existing frameworks
Machine learning and NLP Based
AIML (Artificial Intelligence Markup Language)
How each approach works
IBM Watson, The most intelligent chatbot - Introduction
Modules to build open domain chatbot using NLP and Machine learning
Question Analysis
karimkhanp / svm-vs-nn
Created October 19, 2016 09:46
SVM vs NN - Why NN works better than SVM?
What is the difference between an SVM and a neural network? Is it true that a linear SVM is the same as an NN, and that for problems that are not linearly separable an NN adds hidden layers while an SVM changes the dimensionality of the space?
There are two parts to this question. The first part is "what is the form of function learned by these methods?" For NN and SVM this is typically the same. For example, a single hidden layer neural network uses exactly the same form of model as an SVM. That is:
Given an input vector x, the output is: output(x) = sum_over_all_i weight_i * nonlinear_function_i(x)
Generally the nonlinear functions will also have some parameters. So these methods need to learn how many nonlinear functions should be used, what their parameters are, and what the value of all the weight_i weights should be.
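To make the shared form concrete, here is a small sketch in which the same `model_output` function evaluates both a one-hidden-layer network and a kernel-SVM-style model. All names, weights and inputs are invented for illustration; tanh hidden units and an RBF kernel are assumed choices, and the bias term is left out to match the formula above.

```python
import math

def model_output(x, weights, nonlinear_functions):
    """output(x) = sum over i of weight_i * nonlinear_function_i(x)"""
    return sum(w * f(x) for w, f in zip(weights, nonlinear_functions))

def tanh_unit(v, b):
    """Hidden unit of a one-hidden-layer network: tanh(v . x + b)."""
    return lambda x: math.tanh(sum(vi * xi for vi, xi in zip(v, x)) + b)

def rbf_unit(center, gamma):
    """RBF kernel centred on one support vector: exp(-gamma * ||x - center||^2)."""
    return lambda x: math.exp(-gamma * sum((xi - ci) ** 2 for xi, ci in zip(x, center)))

x = [1.0, 2.0]

# Neural network view: the parameters of each nonlinear function are (v, b)
nn_output = model_output(
    x, [0.5, -1.0], [tanh_unit([1.0, 0.0], 0.0), tanh_unit([0.0, 1.0], -2.0)]
)

# SVM view: each nonlinear function is a kernel centred on a support vector
svm_output = model_output(
    x, [1.0, -1.0], [rbf_unit([1.0, 2.0], 0.5), rbf_unit([0.0, 0.0], 0.5)]
)

print(nn_output, svm_output)
```

Only the nonlinear functions and the way their parameters get chosen differ between the two calls; the weighted-sum evaluation is identical.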
Therefore, the difference between an SVM and an NN is in how they decide what these parameters should be set to. Usually when someone says they are using a neural network they mean they are trying to find the parameters which minimiz