Skip to content

Instantly share code, notes, and snippets.

View micaleel's full-sized avatar
:octocat:
I may be slow to respond.

Khalil micaleel

:octocat:
I may be slow to respond.
View GitHub Profile

A Few Useful Things to Know about Machine Learning

The paper presents some key lessons and "folk wisdom" that machine learning researchers and practitioners have learnt from experience and which are hard to find in textbooks.

1. Learning = Representation + Evaluation + Optimization

All machine learning algorithms have three components:

  • Representation for a learner is the set if classifiers/functions that can be possibly learnt. This set is called hypothesis space. If a function is not in hypothesis space, it can not be learnt.
  • Evaluation function tells how good the machine learning model is.
  • Optimisation is the method to search for the most optimal learning model.
@micaleel
micaleel / elasticsearch-cheatsheet.txt
Created August 28, 2016 00:26 — forked from stephen-puiszis/elasticsearch-cheatsheet.txt
Elasticsearch Cheatsheet - An Overview of Commonly Used Elasticsearch API Endpoints and What They Do
# Elasticsearch Cheatsheet - an overview of commonly used Elasticsearch API commands
# cat paths
/_cat/allocation
/_cat/shards
/_cat/shards/{index}
/_cat/master
/_cat/nodes
/_cat/indices
/_cat/indices/{index}
@micaleel
micaleel / rank_metrics.py
Created October 11, 2016 15:23 — forked from bwhite/rank_metrics.py
Ranking Metrics
"""Information Retrieval metrics
Useful Resources:
http://www.cs.utexas.edu/~mooney/ir-course/slides/Evaluation.ppt
http://www.nii.ac.jp/TechReports/05-014E.pdf
http://www.stanford.edu/class/cs276/handouts/EvaluationNew-handout-6-per.pdf
http://hal.archives-ouvertes.fr/docs/00/72/67/60/PDF/07-busa-fekete.pdf
Learning to Rank for Information Retrieval (Tie-Yan Liu)
"""
import numpy as np
@micaleel
micaleel / gist:da3c3df40ea0e374a20f6d897d4a8c1c
Created February 27, 2017 16:51 — forked from entaroadun/gist:1653794
Recommendation and Ratings Public Data Sets For Machine Learning

Movies Recommendation:

Music Recommendation:

@micaleel
micaleel / request_handler_test.py
Created April 28, 2017 16:24 — forked from didip/request_handler_test.py
Testing Tornado RequestHandlers
import unittest, os, os.path, sys, urllib
import tornado.database
import tornado.options
from tornado.options import options
from tornado.testing import AsyncHTTPTestCase
# add application root to sys.path
APP_ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), '..'))
sys.path.append(os.path.join(APP_ROOT, '..'))
@micaleel
micaleel / pmf
Created November 13, 2018 17:12 — forked from groverpr/pmf
pmf neural net fastai
ratings = pd.read_csv('ratings_small.csv') # loading data from csv
"""
ratings_small.csv has 4 columns - userId, movieId, ratings, and timestammp
it is most generic data format for CF related data
"""
val_indx = get_cv_idxs(len(ratings)) # index for validation set
wd = 2e-4 # weight decay
n_factors = 50 # n_factors - dimension of embedding matrix (D)