Karimkhan karimkhanp

@karimkhanp
karimkhanp / learning
Last active October 13, 2017 09:33
Learning
MLP :
svm - https://en.wikipedia.org/wiki/Support_vector_machine
neural network - https://www.quora.com/What-is-a-simple-explanation-of-how-artificial-neural-networks-work-1/answer/Annalyn-Ng?srid=XpXu
- https://www.quora.com/What-is-a-simple-explanation-of-how-artificial-neural-networks-work-1/answer/Chris-Nicholson-1?srid=XpXu
perceptron model
k-means
naive bayes -
https://web.stanford.edu/class/cs124/lec/naivebayes.pdf
http://stackoverflow.com/questions/10059594/a-simple-explanation-of-naive-bayes-classification
http://deeplearning4j.org/sentiment_analysis_word2vec.html
karimkhanp / sentiment analysis_full
Last active May 23, 2018 20:52
sentiment analysis resources combined
Supervised - Machine learning based
Unsupervised - Lexicon based
English lang:
http://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html#lexicon
http://www.cs.uic.edu/~liub/FBS/opinion-lexicon-English.rar - lexicon sentiment dictionary
https://sites.google.com/site/datascienceslab/projects/multilingualsentiment
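The lexicon-based (unsupervised) approach listed above can be illustrated with a minimal sketch. The word sets below are a tiny hypothetical lexicon made up for this example, not the contents of the linked opinion lexicon, and the scoring rule (positive hits minus negative hits) is only one simple assumption about how such a lexicon might be applied.

```python
import re

# Hypothetical toy lexicon for illustration only; a real sentiment
# lexicon contains thousands of entries.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "poor", "sad"}


def lexicon_sentiment(text):
    """Label text by counting positive and negative lexicon words."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

No training data is needed here, which is what distinguishes this from the supervised, machine-learning-based approach; the trade-off is that the result depends entirely on the coverage of the lexicon.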
karimkhanp / worker
Created February 5, 2015 06:05
gearman worker 1 multiple worker
import gearman

gm_worker = gearman.GearmanWorker(['localhost:4730'])

def task_listener_reverse(gearman_worker, gearman_job):
    # Log the job payload, then return it reversed as the job result.
    print('Reversing string: ' + gearman_job.data)
    return gearman_job.data[::-1]

# gm_worker.set_client_id is optional
karimkhanp / client
Created February 5, 2015 06:04
gearman client multiple worker
import json
import gearman
import time
import sys

def check_request_status(job_request):
    # Report the outcome of a submitted Gearman job request.
    if job_request.complete:
        #print len(job_request.result)
        #data = json.loads(job_request.result)
        print("Job %s finished! Result: %s - %s" % (job_request.job.unique, job_request.state, job_request.result))
    elif job_request.timed_out:
karimkhanp / dictionary
Last active August 29, 2015 14:09
dictionary list
http://scrapmaker.com/home
http://www.momswhothink.com/reading/list-of-verbs.html
karimkhanp / bigdata_resource
Last active April 6, 2022 13:48
Bigdata resources - Did I miss something? Add to it and make it richer
Big data is a combination of several subjects. It mainly requires programming, analysis, NLP, MLP, and mathematics.
To see the links, go to: http://www.quora.com/What-are-some-good-sources-to-learn-big-data
Here are a bunch of courses I came across:
Introduction to CS Course
Notes: An introductory computer science course that teaches coding.
Online Resources:
Udacity - intro to CS course,
Coursera - Computer Science 101
karimkhanp / datasets
Created August 28, 2014 05:14
This gist lists various free data sources for data processing
https://www.cia.gov/library/publications/download/
http://flowingdata.com/2009/10/01/30-resources-to-find-the-data-you-need/
karimkhanp / nlp_defs
Last active August 29, 2015 14:05
Important terminologies for mlp stuff
Freebase - is a large collaborative knowledge base consisting of metadata composed mainly by its community members. It is an online collection of structured data harvested from many sources, including individual 'wiki' contributions.
-> The MQL Read and MQL Write APIs provide access to the Freebase database using the Metaweb Query Language (MQL).
DBpedia - (from "DB" for "database") is a project aiming to extract structured content from the information created as part of the Wikipedia project. This structured information is then made available on the World Wide Web.[1] DBpedia allows users to query relationships and properties associated with Wikipedia resources, including links to other related datasets
-> Data is accessed using an SQL-like query language for RDF called SPARQL. For example, imagine you were interested in the Japanese shōjo manga series Tokyo Mew Mew, and wanted to find the genres of other works written by its illustrator. DBpedia combines information from Wikipedia's entries on
karimkhanp / ann
Created August 19, 2014 09:02
neural network concept and example
An artificial neural network is an interconnected group of nodes, akin to the vast network of neurons in a brain. Here, each circular node represents an artificial neuron and an arrow represents a connection from the output of one neuron to the input of another.
Example: character recognition
Train the system on 10 samples of each digit.
While learning, we collect the pixel positions for each digit.
For example, when learning the digit '1', we collect the pixel positions from each of its 10 training samples and store their normalized mean as the learned value for digit 1.
Now suppose a new digit arrives and we want to identify it. We calculate the Euclidean distance from the input digit to every learned digit in the database. The learned digit with the smallest Euclidean distance is the predicted digit.
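The procedure above can be sketched as a nearest-mean classifier. This is a minimal illustration under assumptions that are not in the original text: each image is assumed to be a flat list of pixel values of equal length, and the names mean_vector, train, and predict are hypothetical helpers chosen for this example.

```python
import math


def mean_vector(samples):
    """Average equal-length pixel vectors element by element."""
    count = len(samples)
    return [sum(column) / count for column in zip(*samples)]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def train(samples_by_digit):
    """Store one mean pixel vector per digit label."""
    return {digit: mean_vector(samples) for digit, samples in samples_by_digit.items()}


def predict(model, pixels):
    """Return the label whose stored mean is closest to the input."""
    return min(model, key=lambda digit: euclidean(model[digit], pixels))


# Toy 3x3 images flattened row by row (assumed data, two samples per digit).
training = {
    "1": [[0, 1, 0, 0, 1, 0, 0, 1, 0], [0, 1, 0, 0, 1, 0, 1, 1, 0]],
    "7": [[1, 1, 1, 0, 0, 1, 0, 0, 1], [1, 1, 1, 0, 0, 1, 0, 1, 0]],
}
model = train(training)
```

Calling predict(model, pixels) on a new flattened image returns the closest learned digit. Note that this averaging scheme uses no weights or layers, so it is a simple distance-based baseline rather than a trained neural network.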
karimkhanp / terminologies
Last active January 11, 2019 11:11
Dumping all terminologies, tools, and technologies required for BigData
-------------------------------------------------------- Edit to Enlarge ----------------------------------------------
Apache spark - Apache Spark is an open-source data analytics cluster computing framework originally developed in the AMPLab at UC Berkeley.[1] Spark fits into the Hadoop open-source community, building on top of the Hadoop Distributed File System (HDFS).[2] However, Spark is not tied to the two-stage MapReduce paradigm, and promises performance up to 100 times faster than Hadoop MapReduce for certain applications.
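For readers unfamiliar with the two-stage MapReduce paradigm mentioned above, here is a minimal single-machine sketch of a word count in plain Python. It does not use the Hadoop or Spark APIs; the function names map_stage, shuffle, reduce_stage, and word_count are hypothetical and exist only for this illustration.

```python
from collections import defaultdict


def map_stage(lines):
    """Map: emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs):
    """Group emitted values by key, as the framework does between stages."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_stage(groups):
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    return reduce_stage(shuffle(map_stage(lines)))
```

Every job in this model must be phrased as one map step followed by one reduce step, which is the restriction the entry above says Spark is not tied to.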
Database pipelining - http://www.tuplejump.com/img/ff08.theplatform.png
As you will notice, it's not just about processing the data; it involves a lot of other components. Collection, storage, exploration, ML and visualization are critical to the project's success.
SOLR - Solr can be used to build a highly scalable data analytics engine that enables customers to engage in fast, real-time knowledge discovery.