@dustalov
dustalov / lexer.rb
Last active August 26, 2024 21:05
Link Grammar for Russian (Parser of the Parser)
# encoding: utf-8
# Processor for the output of Link Grammar for Russian.
#
class LinkParser::Lexer
# This exception is raised when the Link Grammar output is invalid and
# the Lexer is unable to understand it.
#
class InvalidLinkGrammar < RuntimeError
attr_reader :input
@dustalov
dustalov / invoke.py
Last active April 22, 2019 12:44
Tesuçk Invocation Script
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# This program is free software. It comes without any warranty, to
# the extent permitted by applicable law. You can redistribute it
# and/or modify it under the terms of the Do What The Fuck You Want
# To Public License, Version 2, as published by Sam Hocevar. See
# http://sam.zoy.org/wtfpl/COPYING for more details.
import argparse
@dustalov
dustalov / opcorpora.rb
Last active December 28, 2015 18:38
Extract texts from the OpenCorpora XML dump.
#!/usr/bin/env ruby
# encoding: utf-8
require 'rubygems'
require 'nokogiri'
require 'csv'
Dir.mkdir 'opencorpora' unless File.directory? 'opencorpora'
buf, flag = '', false
@dustalov
dustalov / avgdegree.groovy
Last active December 28, 2015 23:09
Uses the Gephi Toolkit to compute the average degree of a given graph in GraphML format.
#!/usr/bin/env groovy
import org.gephi.data.attributes.api.AttributeController
import org.gephi.graph.api.GraphController
import org.gephi.io.importer.api.EdgeDefault
import org.gephi.io.importer.api.ImportController
import org.gephi.io.processor.plugin.DefaultProcessor
import org.gephi.project.api.ProjectController
import org.gephi.statistics.plugin.Degree
import org.openide.util.Lookup
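
For readers without the Gephi Toolkit at hand, the quantity this gist computes can be sketched in plain Python. This is not the gist's code and does not use Gephi: it assumes an undirected graph, where the average degree is 2·|E| / |V|, and it reads GraphML with the standard library XML parser. The function name `average_degree` and the `SAMPLE` graph are hypothetical, made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by GraphML documents.
NS = {"g": "http://graphml.graphdrawing.org/xmlns"}


def average_degree(graphml_text):
    """Return 2 * |E| / |V| for a GraphML document, treating it as undirected."""
    root = ET.fromstring(graphml_text)
    nodes = root.findall(".//g:node", NS)
    edges = root.findall(".//g:edge", NS)
    if not nodes:
        return 0.0
    # Each undirected edge contributes to the degree of two nodes.
    return 2 * len(edges) / len(nodes)


# Hypothetical sample: a triangle a-b-c with one extra node d attached to c.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="G" edgedefault="undirected">
    <node id="a"/>
    <node id="b"/>
    <node id="c"/>
    <node id="d"/>
    <edge source="a" target="b"/>
    <edge source="b" target="c"/>
    <edge source="c" target="a"/>
    <edge source="c" target="d"/>
  </graph>
</graphml>"""

print(average_degree(SAMPLE))
```

Gephi's own statistic may treat directed graphs differently, so the result is only comparable for undirected input.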
@dustalov
dustalov / expectation-maximization.rb
Last active August 29, 2015 14:21
EM-algorithm coin example
#!/usr/bin/env ruby
=begin
http://ai.stanford.edu/~chuongdo/papers/em_tutorial.pdf
http://stats.stackexchange.com/questions/72774/numerical-example-to-understand-expectation-maximization
http://math.stackexchange.com/questions/25111/how-does-expectation-maximization-work
http://math.stackexchange.com/questions/81004/how-does-expectation-maximization-work-in-coin-flipping-problem
http://www.youtube.com/watch?v=7e65vXZEv5Q
=end
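
The preview stops before the algorithm itself, so here is a minimal Python sketch of the two-coin EM procedure that the linked tutorial describes. It is not the gist's Ruby code. The toss counts and starting values are an assumption taken from the well-known example in that tutorial (five sets of ten tosses, initial guesses 0.6 and 0.5), and the name `em_two_coins` is hypothetical.

```python
def em_two_coins(heads, n_tosses, theta_a, theta_b, iterations=10):
    """Estimate head probabilities of two coins when the coin behind each set is unknown."""
    for _ in range(iterations):
        heads_a = tails_a = heads_b = tails_b = 0.0
        for h in heads:
            t = n_tosses - h
            # E-step: posterior probability that this set came from coin A or B.
            # The binomial coefficient is the same for both coins and cancels out.
            like_a = theta_a ** h * (1 - theta_a) ** t
            like_b = theta_b ** h * (1 - theta_b) ** t
            w_a = like_a / (like_a + like_b)
            w_b = 1.0 - w_a
            # Accumulate expected head and tail counts for each coin.
            heads_a += w_a * h
            tails_a += w_a * t
            heads_b += w_b * h
            tails_b += w_b * t
        # M-step: re-estimate each bias from the expected counts.
        theta_a = heads_a / (heads_a + tails_a)
        theta_b = heads_b / (heads_b + tails_b)
    return theta_a, theta_b


# Assumed data: number of heads in each of five sets of ten tosses.
HEADS = [5, 9, 8, 4, 7]
theta_a, theta_b = em_two_coins(HEADS, 10, 0.6, 0.5)
print(round(theta_a, 2), round(theta_b, 2))
```

The sketch assumes both coins are equally likely to be picked for each set, which is why no mixing weight appears in the E-step.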
@dustalov
dustalov / Makefile
Last active December 20, 2016 13:07
Listing generator for a computer program copyright registration.
all: clean lister latex view
clean:
latexmk -C -pdf
rm -f source.tex
lister:
./lister.rb
latex:
latexmk -pdf -pdflatex="xelatex %O %S" listing
view:
xdg-open listing.pdf
@dustalov
dustalov / badges.tex
Last active February 3, 2017 13:40
Badges for SPM2016.
\documentclass[11pt]{letter}
\usepackage[a4paper,landscape]{geometry}
\usepackage{polyglossia}
\setmainlanguage[babelshorthands=true]{russian}
\setotherlanguage{english}
\defaultfontfeatures{Ligatures=TeX,Mapping=tex-text}
@dustalov
dustalov / ruscorpora.rb
Last active April 20, 2016 21:15
Fetch sentences from the Russian National Corpus.
#!/usr/bin/env ruby
require 'net/http'
require 'uri'
require 'nokogiri'
Example = Struct.new(:text, :source)
def ruscorpora(word)
uri = URI('http://search.ruscorpora.ru/download-xml.xml')
@dustalov
dustalov / decoder.sh
Created September 13, 2016 20:25
A brute-force decoder for Cyrillic strings garbled by an unknown combination of charsets.
#!/bin/bash -e
S=$(head -1)
CHARSETS=(utf8 cp1251 cp1252 koi8r koi8u iso-8859-5 maccyrillic)
for c1 in ${CHARSETS[*]}; do
for c2 in ${CHARSETS[*]}; do
for c3 in ${CHARSETS[*]}; do
for c4 in ${CHARSETS[*]}; do
echo -ne "$c1\t$c2\t$c3\t$c4\t"
<<<"$S" iconv -f "$c1" -t "$c2" -c | iconv -f "$c3" -t "$c4" -c
done
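
The same brute-force idea can be sketched in Python, under a simplifying assumption: the script above chains two `iconv` passes over four charsets, while this sketch tries only a single encode/decode pair and skips pairs that fail instead of dropping characters as `iconv -c` does. The function name `candidates` and the sample string are hypothetical; the codec names are Python's spellings of the charsets in the script.

```python
from itertools import product

# Python codec names corresponding to the charsets in the shell script.
CHARSETS = ["utf-8", "cp1251", "cp1252", "koi8_r", "koi8_u", "iso8859_5", "mac_cyrillic"]


def candidates(garbled, charsets=CHARSETS):
    """Return (encode_as, decode_as, text) for every charset pair that round-trips."""
    results = []
    for enc, dec in product(charsets, repeat=2):
        if enc == dec:
            continue  # Encoding and decoding with one charset changes nothing.
        try:
            # Undo the mistake: recover the bytes, then read them with another charset.
            text = garbled.encode(enc).decode(dec)
        except UnicodeError:
            continue  # This pair cannot represent the string; skip it.
        results.append((enc, dec, text))
    return results


# Simulate mojibake: UTF-8 bytes wrongly displayed as cp1251.
garbled = "Привет".encode("utf-8").decode("cp1251")
for enc, dec, text in candidates(garbled):
    print(enc, dec, text, sep="\t")
```

As with the shell version, the output lists every combination that decodes at all, and a human still has to pick the line that reads as sensible text.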
@dustalov
dustalov / ExtractRelations.java
Last active March 21, 2021 19:33
Extract semantic relations from Wiktionary using JWKTL.
import de.tudarmstadt.ukp.jwktl.JWKTL;
import de.tudarmstadt.ukp.jwktl.api.filter.WiktionaryEntryFilter;
import de.tudarmstadt.ukp.jwktl.api.util.Language;
import java.io.File;
import java.util.Locale;
public class ExtractRelations {
public static void main(String[] args) {
if (args.length != 1) {
System.err.println("Usage: java ExtractRelations.java database [filter]");