@kumarbhrgv
kumarbhrgv / draft.md
Last active January 9, 2025 07:33
Diversity in Recommendation Systems

Improving the diversity of personalized recommendation systems

Recent Research papers:

  • Improving Aggregate Recommendation Diversity Using Ranking-Based Techniques:

    107 citations; IEEE Transactions on Knowledge and Data Engineering
    We introduce and explore a number of item ranking techniques that can generate substantially more diverse recommendations across all users while maintaining comparable levels of recommendation accuracy. Comprehensive empirical evaluation consistently shows the diversity gains of the proposed techniques using several real-world rating data sets and different rating prediction algorithms.

  • Recommendation Diversification Using Explanations: (ICDE '09, IEEE 25th International Conference on Data Engineering, 2009)

Traditionally, the problem is addressed through attribute-based diversification: grouping items in the result set that share many common attributes (e.g., genre for movies) and selecting only a limited number of items from each group. It is, however,
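
The ranking-based idea from the first paper listed above can be illustrated with a rough sketch: among items whose predicted rating clears a threshold, recommend the least popular ones first, and measure aggregate diversity as the number of distinct items recommended across all users. The function name, threshold value, and data below are hypothetical, and this is only loosely modeled on popularity-based re-ranking, not the paper's exact procedure.

```python
def rerank_by_popularity(predictions, popularity, threshold, n):
    """Return the top-n items for one user.

    predictions: {item: predicted_rating} for that user (hypothetical input).
    popularity:  {item: number of known ratings} (hypothetical input).
    """
    confident = [i for i, r in predictions.items() if r >= threshold]
    rest = [i for i, r in predictions.items() if r < threshold]
    # Above the threshold: least popular first, ties broken by higher rating.
    confident.sort(key=lambda i: (popularity.get(i, 0), -predictions[i]))
    # Below the threshold: standard ranking by predicted rating.
    rest.sort(key=lambda i: -predictions[i])
    return (confident + rest)[:n]


# Hypothetical data: item popularity and two users' predicted ratings.
popularity = {"blockbuster": 900, "classic": 400, "indie": 20, "niche": 5}
predicted = {
    "u1": {"blockbuster": 4.9, "classic": 4.6, "indie": 4.2, "niche": 3.0},
    "u2": {"blockbuster": 4.8, "classic": 4.1, "indie": 3.1, "niche": 4.3},
}

# Standard top-1 list (highest predicted rating) versus the re-ranked one.
standard = {u: sorted(p, key=p.get, reverse=True)[:1] for u, p in predicted.items()}
reranked = {u: rerank_by_popularity(p, popularity, 4.0, 1) for u, p in predicted.items()}

# Aggregate diversity: number of distinct items recommended across all users.
diversity_standard = len({i for items in standard.values() for i in items})
diversity_reranked = len({i for items in reranked.values() for i in items})
```

In this toy data both users get the same item under the standard ranking, while the re-ranked lists differ, at the cost of recommending items with somewhat lower predicted ratings.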

@lambdalisue
lambdalisue / jupyterhub
Last active July 24, 2022 01:41
A service (init.d) script for jupyterhub
#! /bin/sh
### BEGIN INIT INFO
# Provides: jupyterhub
# Required-Start: $remote_fs $syslog
# Required-Stop: $remote_fs $syslog
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: Start jupyterhub
# Description: This file should be used to construct scripts to be
# placed in /etc/init.d.
@bsweger
bsweger / useful_pandas_snippets.md
Last active April 4, 2025 21:20
Useful Pandas Snippets

A personal diary of DataFrame munging over the years.

Data Types and Conversion

Convert Series datatype to numeric (will error if column has non-numeric values)
(h/t @makmanalp)
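
A minimal sketch of this conversion, assuming a hypothetical DataFrame `df` with a string column `price`. By default `pd.to_numeric` raises on values it cannot parse; passing `errors="coerce"` is an alternative that yields NaN instead.

```python
import pandas as pd

# Hypothetical data: numbers stored as strings.
df = pd.DataFrame({"price": ["1.5", "2", "3.25"]})

# Default behaviour (errors="raise"): fails on any non-numeric value.
df["price"] = pd.to_numeric(df["price"])

# Alternative: turn unparseable values into NaN instead of raising.
messy = pd.Series(["1", "two", "3"])
coerced = pd.to_numeric(messy, errors="coerce")
```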

@brianckeegan
brianckeegan / backbone_extractor.py
Last active August 2, 2023 19:26
Given a networkx graph with weighted edges and a significance threshold alpha, this code returns another networkx graph containing the "backbone" of the graph: the subset of weighted edges that pass the threshold, following the method of Serrano et al. (2009).
# Serrano, Boguna, Vespignani backbone extractor
# from http://www.pnas.org/content/106/16/6483.abstract
# Thanks to Michael Conover and Qian Zhang at Indiana for help on earlier versions
# Thanks to Clay Davis for pointing out an error
import networkx as nx
import numpy as np
def extract_backbone(g, weight='weight', alpha=.05):
    backbone_graph = nx.Graph()
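
Since the preview above is cut off, here is a self-contained sketch of the disparity filter from the cited paper, not the gist's own code. For a node of degree k, an edge carrying a fraction p of the node's total weight gets the score (1 - p)^(k - 1), and the edge is kept when that score falls below alpha for at least one endpoint. The function name `disparity_backbone` and the toy graph are hypothetical.

```python
import networkx as nx


def disparity_backbone(g, weight="weight", alpha=0.05):
    """Keep edges that are statistically significant for at least one endpoint."""
    backbone = nx.Graph()
    for node in g:
        k = g.degree(node)
        if k < 2:
            continue  # the test is not informative for degree-1 nodes
        strength = sum(data[weight] for data in g[node].values())
        for neighbor, data in g[node].items():
            p = data[weight] / strength
            # Closed form of 1 - (k - 1) * integral_0^p (1 - x)^(k - 2) dx
            alpha_ij = (1.0 - p) ** (k - 1)
            if alpha_ij < alpha:
                backbone.add_edge(node, neighbor, **{weight: data[weight]})
    return backbone


# Toy star graph: one heavy edge and four light ones around node 0.
g = nx.Graph()
g.add_weighted_edges_from([(0, 1, 100), (0, 2, 1), (0, 3, 1), (0, 4, 1), (0, 5, 1)])
backbone = disparity_backbone(g)
```

Only the heavy edge between nodes 0 and 1 survives here, because it alone carries a disproportionate share of node 0's total weight.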

Build your own private, encrypted, open-source Dropbox-esque sync folder

Prerequisites:

  • One or more clients running a UNIX-like OS. Examples are given for Ubuntu 12.04 LTS, although all software components are available for other platforms as well (e.g. OS X). YMMV.
  • A cheap Ubuntu 12.04 VPS with storage. I recommend Backupsy; they offer 250GB of storage for $5/month. Ask Google for coupon codes.

Software components used:

  • Unison for file synchronization
  • EncFS for folder encryption
@thiagomarzagao
thiagomarzagao / mcq.py
Last active May 4, 2020 01:50
The Python script below implements the “Fightin’ Words” algorithm (see Monroe, B., Colaresi, M., and Quinn, K. (2008). Fightin’ words: lexical feature selection and evaluation for identifying the content of political conflict. Political Analysis, 16(4), pp. 372-403). It takes word-frequency matrices in CSV format as input. The first…
### FIGHTIN' WORDS (MCQ-2008)
### author: Thiago Marzagao
### contact: marzagao ddott 1 at osu ddott edu
import os
import sys
import pandas as pd
import numpy as np
from numpy import matrix as m
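
As a rough illustration of the statistic behind the method, the sketch below computes z-scores of the log-odds ratio with a Dirichlet prior, one of the estimators discussed in the cited paper. It is not necessarily the exact variant this script implements; the function name, the in-memory count arrays (instead of CSV input), and the uniform prior are all assumptions.

```python
import numpy as np


def fightin_words_z(counts_a, counts_b, prior):
    """z-scores of the prior-smoothed log-odds ratio, one per word."""
    counts_a = np.asarray(counts_a, dtype=float)
    counts_b = np.asarray(counts_b, dtype=float)
    prior = np.asarray(prior, dtype=float)
    n_a, n_b, a0 = counts_a.sum(), counts_b.sum(), prior.sum()
    # Log-odds of each word within each group, smoothed by the prior.
    log_odds_a = np.log((counts_a + prior) / (n_a + a0 - counts_a - prior))
    log_odds_b = np.log((counts_b + prior) / (n_b + a0 - counts_b - prior))
    delta = log_odds_a - log_odds_b
    # Approximate variance of the difference in log-odds.
    variance = 1.0 / (counts_a + prior) + 1.0 / (counts_b + prior)
    return delta / np.sqrt(variance)


# Hypothetical counts for three words in two groups of documents.
z = fightin_words_z([30, 5, 10], [5, 30, 10], prior=[1, 1, 1])
```

Positive scores mark words associated with the first group, negative scores with the second, and a word used equally by both groups scores zero.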
@securitytube
securitytube / ssid-sniffer-scapy-python.py
Created April 2, 2013 12:49
WLAN SSID Sniffer in Python using Scapy
#!/usr/bin/env python
from scapy.all import *
ap_list = []
def PacketHandler(pkt):
    if pkt.haslayer(Dot11):
        if pkt.type == 0 and pkt.subtype == 8:
@migurski
migurski / merge-geojsons.py
Created September 21, 2012 03:43
Merge multiple GeoJSON files into one
from json import load, JSONEncoder
from optparse import OptionParser
from re import compile
float_pat = compile(r'^-?\d+\.\d+(e-?\d+)?$')
charfloat_pat = compile(r'^[\[,\,]-?\d+\.\d+(e-?\d+)?$')
parser = OptionParser(usage="""%prog [options]
Group multiple GeoJSON files into one output file.
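
The preview above is truncated, so here is a minimal standard-library sketch of the core merge step only: collect the features of several GeoJSON objects into one FeatureCollection. The function name and sample features are hypothetical, and the float-precision handling that the regular expressions above hint at is left out.

```python
import json


def merge_feature_collections(collections):
    """Concatenate the features of several parsed GeoJSON objects."""
    features = []
    for geojson in collections:
        if geojson.get("type") == "FeatureCollection":
            features.extend(geojson.get("features", []))
        elif geojson.get("type") == "Feature":
            features.append(geojson)
    return {"type": "FeatureCollection", "features": features}


# Hypothetical inputs; in practice each would be read from a file with json.load.
a = {"type": "FeatureCollection", "features": [
    {"type": "Feature", "properties": {"name": "A"},
     "geometry": {"type": "Point", "coordinates": [-122.27, 37.8]}}]}
b = {"type": "Feature", "properties": {"name": "B"},
     "geometry": {"type": "Point", "coordinates": [-73.99, 40.73]}}

merged = merge_feature_collections([a, b])
text = json.dumps(merged, separators=(",", ":"))
```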
@gotgenes
gotgenes / edgeswap.py
Created May 22, 2012 16:12
Edge swap graph.
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
# Copyright (c) 2011-2012 Christopher D. Lasher
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
# "Software"), to deal in the Software without restriction, including
# without limitation the rights to use, copy, modify, merge, publish,
# distribute, sublicense, and/or sell copies of the Software, and to
@mblondel
mblondel / lda_gibbs.py
Last active October 9, 2023 11:31
Latent Dirichlet Allocation with Gibbs sampler
"""
(C) Mathieu Blondel - 2010
License: BSD 3 clause
Implementation of the collapsed Gibbs sampler for
Latent Dirichlet Allocation, as described in
Finding scientific topics (Griffiths and Steyvers)
"""