svartalf svartalf

@svartalf
svartalf / gist:5501335
Last active December 16, 2015 21:39
Advanced search of Irkutsk Twitter without spam, bots, ads, and political crap.
#irkutsk OR #twirkutsk OR #irkru OR #Иркутск -HeepUriah -kareetc64 -BEM_Irkutsk -OWS633 -OWS337 -OWS334 -SimonDeMonfor -naidikvartiru -deSAD -myvotefactor -Dezirtir -RomanovPP -Ostaked -2Vitel -er_Irkutsk -avtonom_org -OWS557 -vumatin -optimumirk -inkadoru -er_Irkutsk -UDACHA_MARKET -deloirk -RRABBOTTA -BystroZaim -IrkutskDate -tvoemoehelp -irkutskyoutube -irk_fan_shop
svartalf / elasticsearch_ru_stemming_and_morphology.py
Last active January 13, 2022 12:21
Example of an Elasticsearch configuration for Russian stemming and morphology
requests.put('http://localhost:9200/site/', data=json.dumps({
    'settings': {
        'analysis': {
            'analyzer': {
                'ru': {
                    'type': 'custom',
                    'tokenizer': 'standard',
                    'filter': ['lowercase', 'russian_morphology', 'english_morphology', 'ru_stopwords'],
                },
            },
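The preview above is cut off before the `ru_stopwords` filter is defined. A minimal sketch of what a complete request body could look like follows. It assumes that `russian_morphology` and `english_morphology` are token filters provided by an installed morphology analysis plugin, and the `ru_stopwords` definition (a `stop` filter backed by the built-in `_russian_` list) is a guess at the elided part, not the gist's actual contents. The helper name `build_index_body` is hypothetical.

```python
import json


def build_index_body():
    """Return index settings with a custom 'ru' analyzer (hypothetical completion)."""
    return {
        'settings': {
            'analysis': {
                'analyzer': {
                    'ru': {
                        'type': 'custom',
                        'tokenizer': 'standard',
                        'filter': ['lowercase', 'russian_morphology',
                                   'english_morphology', 'ru_stopwords'],
                    },
                },
                'filter': {
                    # Assumption: a stop filter using the built-in Russian list.
                    'ru_stopwords': {
                        'type': 'stop',
                        'stopwords': '_russian_',
                    },
                },
            },
        },
    }


# Serialized body, ready to be sent with PUT to http://localhost:9200/site/
payload = json.dumps(build_index_body())
print(payload)
```

Building the body in a separate function keeps the settings easy to inspect before anything is sent to the cluster.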
svartalf / gist:2880645
Created June 6, 2012 08:26
PostgreSQL benchmark for a database index on Django's `auth_user.is_staff` field
database=> SELECT COUNT(*) FROM auth_user;
 count
-------
  1076
(1 row)

database=> EXPLAIN SELECT COUNT(*) FROM auth_user WHERE is_staff = False;
                            QUERY PLAN
-------------------------------------------------------------------
 Aggregate  (cost=26.80..26.81 rows=1 width=0)
svartalf / gist:2880639
Created June 6, 2012 08:23
MySQL benchmark for a database index on Django's `auth_user.is_staff` field
mysql> SELECT COUNT(*) FROM auth_user;
+----------+
| COUNT(*) |
+----------+
| 48158 |
+----------+
1 row in set (0.00 sec)
mysql> EXPLAIN SELECT COUNT(*) FROM auth_user WHERE is_staff = 1;
+----+-------------+-----------+------+---------------+------+---------+------+-------+-------------+
svartalf / timing.py
Created May 27, 2012 13:11
Test of the list and tuple __getitem__ performance
from __future__ import print_function

import sys
import timeit

l = list(range(10))
t = tuple(l)

def func(obj):
    r = list(range(1000))
    for i in r:
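The preview stops before the loop body, so the original measurement is not visible. As a rough sketch of how such a comparison could be run on Python 3, the snippet below times a single index lookup on a list and on a tuple with `timeit`. The helper `time_getitem`, the index `5`, and the repeat count are assumptions made for illustration, not the gist's code.

```python
import timeit

l = list(range(10))
t = tuple(l)


def time_getitem(obj, number=100000):
    """Return total seconds spent on `number` evaluations of obj[5]."""
    return timeit.timeit('obj[5]', globals={'obj': obj}, number=number)


list_time = time_getitem(l)
tuple_time = time_getitem(t)
print('list:  {:.4f}s'.format(list_time))
print('tuple: {:.4f}s'.format(tuple_time))
```

The absolute numbers depend on the interpreter and the machine, so only the relative difference between the two lines is meaningful.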
svartalf / pr_sort.py
Created May 10, 2012 23:35
Script for sorting a CSV file that lists domains with their Page Rank
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""Page Rank CSV sorter

Usage:
    pr_sort.py /path/to/csv-file.csv [/path/to/output-file.csv]

The output filename is optional; if it is omitted, output goes to stdout.
"""
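Only the docstring of the script is shown, so the sorting logic itself is unknown. The sketch below shows one way the described behaviour could be implemented on Python 3. It assumes each row is `domain,rank` with an integer rank in the second column and that higher ranks should come first. The function names `sort_by_rank` and `write_sorted` are hypothetical.

```python
import csv
import io
import sys


def sort_by_rank(rows, rank_column=1):
    """Sort rows by integer Page Rank, highest first (assumed column layout)."""
    return sorted(rows, key=lambda row: int(row[rank_column]), reverse=True)


def write_sorted(in_path, out_path=None):
    """Read a CSV file, sort it, and write to out_path or stdout if omitted."""
    with open(in_path, newline='') as src:
        rows = sort_by_rank(list(csv.reader(src)))
    if out_path is None:
        csv.writer(sys.stdout).writerows(rows)
    else:
        with open(out_path, 'w', newline='') as dst:
            csv.writer(dst).writerows(rows)


# In-memory demonstration with made-up rows.
sample = io.StringIO('example.com,3\nexample.org,7\nexample.net,5\n')
sorted_rows = sort_by_rank(list(csv.reader(sample)))
print(sorted_rows)
```

Calling `write_sorted('/path/to/csv-file.csv')` would mirror the documented usage without an output file, sending the sorted rows to stdout.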