@bekce
bekce / simjoin.spark
Last active June 20, 2018 06:30
Similarity Join
//Spark Similarity Join Algorithm
//Author: Selim Eren Bekçe
//Date: 2015-12-22
var lines = sc.textFile("tweets10K.tsv",8).map(s => s.split("\t"))
lines = lines.filter( t => t.length == 2 )
var pairs = lines.flatMap( s => {
  val rid = s(0) // record id
  val text = s(1) // text to be tokenized
  val tokens = text.split("[^\\p{L}\\p{Nd}/:]") // split on any char that is not a letter, a digit, '/' or ':'
bekce / Solution.java
Created January 3, 2017 11:39
[scratchpad] some problem solutions
import java.io.*;
import java.util.*;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;
import java.util.concurrent.RecursiveTask;
/**
* Created by seb on 11/9/2016.
*/
public class Solution {
bekce / bash_history.sh
Last active November 10, 2021 21:26
OpenVZ setup (legacy)
yum update
yum -y update
yum -y install openssh-clients openssh-servers nano
yum -y install openssh-clients openssh-server nano
ifconfig
chkconfig sshd on
service sshd start
shutdown -h now
ifconfig
wget -P /etc/yum.repos.d/ http://ftp.openvz.org/openvz.repo
bekce / README.md
Created February 13, 2017 11:46
Node.js IMAP client with parsing

This example connects to an IMAP server and retrieves emails.

npm install imap mailparser
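
For readers without a Node.js setup, the same idea can be sketched with Python's standard library. This is not the gist's implementation: `imaplib` and `email` are swapped in for the `imap` and `mailparser` packages, and the function names, the `INBOX` mailbox, and the `UNSEEN` search criterion are assumptions chosen for illustration.

```python
import imaplib
from email import message_from_bytes, policy


def parse_message(raw):
    """Parse raw RFC 822 bytes into a small dict of headers and body text."""
    msg = message_from_bytes(raw, policy=policy.default)
    body = msg.get_body(preferencelist=("plain", "html"))
    return {
        "subject": msg["subject"],
        "from": msg["from"],
        "text": body.get_content() if body is not None else "",
    }


def fetch_unseen(host, user, password):
    """Connect over TLS, fetch every unseen message in INBOX, and parse it.

    Hypothetical helper: host, user and password are supplied by the caller.
    """
    parsed = []
    with imaplib.IMAP4_SSL(host) as conn:
        conn.login(user, password)
        conn.select("INBOX")
        status, data = conn.search(None, "UNSEEN")
        for num in data[0].split():
            status, parts = conn.fetch(num, "(RFC822)")
            # For a successful fetch, parts[0] is an (envelope, raw_bytes) tuple.
            parsed.append(parse_message(parts[0][1]))
    return parsed


# Parsing can be tried offline on a hand-written message.
sample = b"From: alice@example.com\r\nSubject: hello\r\n\r\nHi there\r\n"
print(parse_message(sample)["subject"])
```

Keeping the parsing step separate from the network step makes it possible to test the parser without a live mailbox.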

bekce / README.md
Created February 21, 2017 13:36
LDAP server with MySQL backend

I wanted to build an LDAP server that queries a MySQL server to fetch users and verify their passwords. It is mainly useful for older software that cannot work with custom OAuth2 providers; Redmine is one example.

Instructions:

  1. Create the database and table with insert.sql
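// Requires: import java.math.BigInteger;
// Returns the n-th Fibonacci number (1-indexed: F(1) = F(2) = 1).
public static BigInteger fibonacci(long n) {
    if (n < 1) throw new IllegalArgumentException("n>=1 must hold");
    BigInteger cur = BigInteger.ONE, t1 = BigInteger.ONE, t2 = BigInteger.ONE;
    // Loop body runs n - 2 times; t1 and t2 hold the two most recent terms.
    for (long i = 2; i < n; i++) {
        cur = t1.add(t2);
        t1 = t2;
        t2 = cur;
    }
    return cur;
}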
git filter-branch --env-filter '
OLD_EMAIL="[email protected]"
CORRECT_NAME="bekce"
CORRECT_EMAIL="[email protected]"
if [ "$GIT_COMMITTER_EMAIL" = "$OLD_EMAIL" ]
then
    export GIT_COMMITTER_NAME="$CORRECT_NAME"
    export GIT_COMMITTER_EMAIL="$CORRECT_EMAIL"
fi
if [ "$GIT_AUTHOR_EMAIL" = "$OLD_EMAIL" ]
bekce / PartitioningTest.scala
Last active June 7, 2017 14:24
Spark partitioning test with multiple RDDs
import java.lang.management.ManagementFactory
import java.net.InetAddress
import org.apache.spark.rdd.RDD
import org.apache.spark.{Partitioner, SparkContext}
import scala.runtime.ScalaRunTime
/**
* Note: Package as a jar and run with spark-submit against a running cluster.
* Created by bekce on 6/5/17.
*/
bekce / SSLUtilities.java
Created February 7, 2018 11:16
Ignore SSL/TLS trust/certificate errors in Java. Call SSLUtilities.trustAllHttpsCertificates() at init
package utils;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.security.cert.X509Certificate;
import javax.net.ssl.HostnameVerifier;
import javax.net.ssl.HttpsURLConnection;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;
import logging
import logging.handlers
logging.basicConfig(level=logging.INFO, format='%(asctime)s %(levelname)-10s %(message)s')
smtp_handler = logging.handlers.SMTPHandler(mailhost=("localhost", 25),
                                            fromaddr="[email protected]",
                                            toaddrs=["[email protected]"],
                                            subject=u"stuff failed")
mail_logger = logging.getLogger("mail")
mail_logger.addHandler(smtp_handler)
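
As a rough illustration of how a logger wired this way behaves, the sketch below swaps `SMTPHandler` for a hypothetical in-memory handler (`ListHandler`, not part of the standard library) so it runs without a mail server. The logger name `mail_demo` and the `ERROR` threshold on the handler are assumptions made for the demo, not part of the original snippet.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical stand-in for SMTPHandler: stores formatted records in a list."""

    def __init__(self, level=logging.NOTSET):
        super().__init__(level)
        self.messages = []

    def emit(self, record):
        # SMTPHandler would send one email per record at this point.
        self.messages.append(self.format(record))


def build_demo_logger():
    """Return (logger, handler) wired the same way as the SMTP example."""
    handler = ListHandler(level=logging.ERROR)  # assumption: only errors are mailed
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger = logging.getLogger("mail_demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep demo records away from the root logger
    logger.handlers.clear()
    logger.addHandler(handler)
    return logger, handler


demo_logger, demo_handler = build_demo_logger()
demo_logger.info("job started")    # below the handler's level: not captured
demo_logger.error("stuff failed")  # captured; SMTPHandler would email this
print(demo_handler.messages)
```

With the real `SMTPHandler`, every record that reaches the handler is sent as a separate email, so setting a level on the handler is one way to limit mail volume.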