Zikani Nyirenda Mwase zikani03

@zikani03
zikani03 / pentaho_query_reader.rs
Created July 18, 2017 11:12
Extract SQL queries from Pentaho files with Rust
extern crate zip;
extern crate quick_xml;
extern crate html_entities;
use std::io::BufReader;
use std::fs::File;
use std::io::Read;
use std::ops::Deref;
use std::collections::BTreeMap;
use zip::read::ZipArchive;
NPR,Fresh Air,http://www.npr.org/rss/podcast.php?id=381444908
,Wait Wait... Don't Tell Me,http://www.npr.org/rss/podcast.php?id=344098539
,Bullseye with Jesse Thorn,http://npr.org/rss/podcast.php?id=510309
,On Point With Tom Ashbrook,http://www.npr.org/rss/podcast.php?id=510053
,Only A Game,http://www.npr.org/rss/podcast.php?id=510052
,Here & Now,http://www.npr.org/rss/podcast.php?id=510051
,Latino USA,http://www.npr.org/rss/podcast.php?id=510016
,Car Talk,http://www.npr.org/rss/podcast.php?id=510208
,Piano Jazz Shorts,http://www.npr.org/rss/podcast.php?id=510056
,From The Top,http://www.npr.org/rss/podcast.php?id=510026
@risicle
risicle / cache_chained_calculation.py
Created March 17, 2016 16:54
Django cache work-sharing using PostgreSQL advisory locks
from django.core.cache import cache
from django.core.cache.backends.base import DEFAULT_TIMEOUT
from django.db import connection, transaction
from hashlib import md5
def cache_chained_calculation(characteristic_str, calculate_function, timeout=DEFAULT_TIMEOUT, force_update=False):
    """
    Attempt to obtain result of @calculate_function, represented by @characteristic_str, through cache or calling the
    function. Should only allow one caller to be calculating the value at once (enforced using postgres advisory locks),
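The preview above is cut off before the locking code, so the following is only a rough sketch of how a string such as `characteristic_str` might be turned into a PostgreSQL advisory lock. The helper names `advisory_lock_key` and `acquire_xact_lock` are hypothetical and not taken from the gist. The sketch assumes the key is derived from an MD5 digest (the gist imports `md5`, but how it uses it is not shown) and relies on `pg_advisory_xact_lock(bigint)`, which blocks until the lock is free and releases it when the enclosing transaction ends.

```python
import struct
from hashlib import md5


def advisory_lock_key(characteristic_str):
    """Map a string to a signed 64-bit integer usable as an advisory lock key.

    Assumption: the first 8 bytes of the MD5 digest are used; the original
    gist may derive its key differently.
    """
    digest = md5(characteristic_str.encode("utf-8")).digest()
    # ">q" unpacks 8 bytes as a big-endian signed 64-bit integer (a bigint).
    return struct.unpack(">q", digest[:8])[0]


def acquire_xact_lock(cursor, characteristic_str):
    """Block until the transaction-scoped advisory lock for this string is held.

    `cursor` is any DB-API cursor on a PostgreSQL connection, for example one
    from Django's `connection.cursor()` inside `transaction.atomic()`.
    """
    cursor.execute(
        "SELECT pg_advisory_xact_lock(%s)",
        [advisory_lock_key(characteristic_str)],
    )
```

With a helper like this, a caller would open a transaction, take the lock, re-check the cache, and only compute the value if it is still missing, so that concurrent callers wait for the first one instead of repeating the work.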
@kristopolous
kristopolous / hn_seach.js
Last active July 24, 2023 04:12
hn job query search
// Usage:
// Copy and paste all of this into a debug console window of the "Who is Hiring?" comment thread
// then use as follows:
//
// query(term | [term, term, ...], term | [term, term, ...], ...)
//
// When arguments are in an array, that means an "or"; when they are separate, that means "and"
//
// Term is of the format:
// ((-)text/RegExp) ( '-' means negation )
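The and/or/negation semantics described in the usage comment can be sketched as follows. This is a Python illustration of the matching rules only, not the gist's JavaScript. The names `term_matches` and `query` over a list of strings are assumptions, as are case-insensitive substring matching and the choice that only string terms (not compiled patterns) accept the leading `-`.

```python
import re


def term_matches(term, text):
    """Return True if one term matches the text.

    A string term is a case-insensitive substring test (assumption); a leading
    '-' negates it. A compiled regular expression is searched as-is.
    """
    if isinstance(term, str):
        negate = term.startswith("-")
        needle = term[1:] if negate else term
        found = needle.lower() in text.lower()
        return found != negate
    return term.search(text) is not None


def query(comments, *groups):
    """Keep comments where every argument matches ("and").

    An argument that is a list or tuple matches if any of its terms match ("or").
    """
    results = []
    for text in comments:
        if all(
            any(term_matches(t, text)
                for t in (g if isinstance(g, (list, tuple)) else [g]))
            for g in groups
        ):
            results.append(text)
    return results
```

For example, `query(comments, "remote", ["rust", "java"])` keeps comments that mention "remote" and at least one of "rust" or "java", while `query(comments, "-remote")` keeps those that do not mention "remote".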
@tcollins
tcollins / -Spring-JPA-Dynamic-Query-With-Limit
Last active January 16, 2025 23:13
Spring Data JPA - Limit results when using Specifications without an unnecessary count query being executed
If you use the findAll(Specification, Pageable) method, a count query is first executed and then the
data query is executed if the count returns a value greater than the offset.
For what I was doing I did not need pageable, but simply wanted to limit my results. This is easy
to do with static named queries and methodNameMagicGoodness queries, but from my research (googling
for a few hours) I couldn't find a way to do it with dynamic criteria queries using Specifications.
During my search I found two things that helped me to figure out how to just do it myself.
1.) A Stack Overflow question.
@rodricios
rodricios / summarize.py
Last active November 18, 2020 17:21
Flipboard's summarization algorithm, sort of
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
pip install networkx distance pattern
In Flipboard's article[1], they kindly divulge their interpretation
of the summarization technique called LexRank[2].
@smhanov
smhanov / dawg.py
Last active April 19, 2025 16:37
Use a DAWG as a map
#!/usr/bin/python3
# By Steve Hanov, 2011. Released to the public domain.
# Please see http://stevehanov.ca/blog/index.php?id=115 for the accompanying article.
#
# Based on Daciuk, Jan, et al. "Incremental construction of minimal acyclic finite-state automata."
# Computational linguistics 26.1 (2000): 3-16.
#
# Updated 2014 to use DAWG as a mapping; see
# Kowaltowski, T.; CL. Lucchesi (1993), "Applications of finite automata representing large vocabularies",
# Software-Practice and Experience 1993
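The idea of using an automaton as a mapping can be sketched like this: each node stores how many words are reachable from it, which lets a lookup compute the word's position in sorted order, and that position indexes a separate list of values. This is a minimal sketch, not the gist's code. It uses a plain trie rather than a minimized DAWG to stay short (the counting trick works the same way on either), and the names `Node`, `build`, `index_of`, and `lookup` are hypothetical.

```python
class Node:
    def __init__(self):
        self.final = False   # True if a word ends at this node
        self.edges = {}      # letter -> child Node
        self.count = 0       # number of words reachable from this node


def _count(node):
    """Fill in node.count bottom-up and return it."""
    node.count = (1 if node.final else 0) + sum(
        _count(child) for child in node.edges.values())
    return node.count


def build(words):
    """Build a trie over the words and annotate it with reachable-word counts."""
    root = Node()
    for word in words:
        node = root
        for ch in word:
            node = node.edges.setdefault(ch, Node())
        node.final = True
    _count(root)
    return root


def index_of(root, word):
    """Return the word's position in sorted order, or None if it is absent."""
    node, skipped = root, 0
    for ch in word:
        if ch not in node.edges:
            return None
        if node.final:
            skipped += 1  # a shorter word ending here sorts before this one
        # Every word under a smaller letter sorts before this one.
        skipped += sum(child.count
                       for label, child in node.edges.items() if label < ch)
        node = node.edges[ch]
    return skipped if node.final else None


def lookup(root, values, key):
    """Map a key to its value; values must be ordered by sorted key."""
    i = index_of(root, key)
    return None if i is None else values[i]
```

A mapping is then a pair: the automaton built from the keys and a list of values arranged in sorted-key order, for example `items = sorted(d.items())`, `root = build(k for k, _ in items)`, `values = [v for _, v in items]`.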
@TheWaWaR
TheWaWaR / gunicorn_config.py
Last active June 28, 2022 11:31
Gunicorn configuration sample
import os
app = '{YOUR-WSGI-APPLICATION}'
# Sample Gunicorn configuration file.
#
# Server socket
#
# bind - The socket to bind.
@neolitec
neolitec / BasicAuthenticationFilter.java
Created February 12, 2014 11:09
HTTP Basic authentication Java filter
package com.neolitec.examples;
import org.apache.commons.codec.binary.Base64;
import org.apache.commons.lang.StringUtils;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import javax.servlet.*;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
@debasishg
debasishg / gist:8172796
Last active April 20, 2025 12:45
A collection of links for streaming algorithms and data structures

General Background and Overview

  1. Probabilistic Data Structures for Web Analytics and Data Mining : A great overview of the space of probabilistic data structures and how they are used in approximation algorithm implementation.
  2. Models and Issues in Data Stream Systems
  3. Philippe Flajolet’s contribution to streaming algorithms : A presentation by Jérémie Lumbroso that visits some of the historical perspectives and how it all began with Flajolet.
  4. Approximate Frequency Counts over Data Streams by Gurmeet Singh Manku & Rajeev Motwani : One of the early papers on the subject.
  5. [Methods for Finding Frequent Items in Data Streams](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.9800&rep=rep1&t