🚀
On the highway of data...

Kyriakos Sideris kyrsideris

@amscotti
amscotti / md5.coffee
Last active January 18, 2021 12:54
MD5 hashing
crypto = require('crypto');
#Quick MD5 of text
text = "MD5 this text!"
md5hash1 = crypto.createHash('md5').update(text).digest("hex")
#MD5 of text with updates
m = crypto.createHash('md5')
m.update("MD5 ")
m.update("this ")
@delagoya
delagoya / test.avdl
Last active July 27, 2023 19:55
Avro IDL comments to Avrodoc example 2
@namespace("com.example")
/**
This is a comment for the whole protocol
*/
protocol Example {
/**
The comment applies to the `NoSpaces` record, but is not indented to the
@protrolium
protrolium / ffmpeg.md
Last active November 12, 2024 21:27
ffmpeg guide

ffmpeg

Converting Audio into Different Formats / Sample Rates

Minimal example: transcode from MP3 to WMA:
ffmpeg -i input.mp3 output.wma

You can get the list of supported formats with:
ffmpeg -formats
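
For scripting the two commands above, a minimal Python sketch follows. It assumes `ffmpeg` is installed and on the PATH; the helper names `transcode_command`, `formats_command` and `run` are hypothetical and only wrap the `-i` and `-formats` options already shown.

```python
import subprocess


def transcode_command(src, dst):
    """Build the argument list for `ffmpeg -i <src> <dst>`.

    ffmpeg guesses the output format from the extension of `dst`.
    """
    return ["ffmpeg", "-i", src, dst]


def formats_command():
    """Build the argument list for `ffmpeg -formats`."""
    return ["ffmpeg", "-formats"]


def run(command):
    """Run a command; raises CalledProcessError on a non-zero exit status."""
    return subprocess.run(command, check=True)
```

Calling `run(transcode_command("input.mp3", "output.wma"))` would then perform the MP3 to WMA conversion, provided `input.mp3` exists. Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.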

You can get the list of installed codecs with:

@kyrsideris
kyrsideris / jointesting.py
Created January 8, 2016 00:24
This script exercises the inner join as well as the left, right and full outer joins as implemented in Apache Spark. The Employee and Department tables were inspired by the examples in Wikipedia's article on the join operation: https://en.wikipedia.org/wiki/Join_(SQL)
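
The four join types named in this description can be illustrated without Spark. The following is a plain-Python sketch, not the gist's code: the `join` helper, the department rows and the employee `(40, "Doe")` are made up for illustration; only the first three employee rows come from the preview below. Rows are returned as `(key, (left_value, right_value))`, with `None` standing in for a missing side.

```python
from collections import defaultdict


def _group(pairs):
    """Group (key, value) pairs into {key: [values]}."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def join(left, right, how="inner"):
    """Join two lists of (key, value) pairs; a missing side becomes None."""
    l, r = _group(left), _group(right)
    if how == "inner":
        keys = l.keys() & r.keys()   # keys present on both sides
    elif how == "left":
        keys = l.keys()              # every left key
    elif how == "right":
        keys = r.keys()              # every right key
    elif how == "full":
        keys = l.keys() | r.keys()   # every key from either side
    else:
        raise ValueError("unknown join type: " + how)
    rows = []
    for key in sorted(keys):
        for lv in l.get(key, [None]):
            for rv in r.get(key, [None]):
                rows.append((key, (lv, rv)))
    return rows


# (40, "Doe") and all department rows are hypothetical sample data.
employees = [(31, "Rafferty"), (33, "Jones"), (33, "Heisenberg"), (40, "Doe")]
departments = [(31, "Sales"), (33, "Engineering"), (35, "Marketing")]

for how in ("inner", "left", "right", "full"):
    print(how, join(employees, departments, how))
```

With this sample data the inner join drops keys 35 and 40, the left join keeps `(40, ("Doe", None))`, the right join keeps `(35, (None, "Marketing"))`, and the full outer join keeps both.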
"""
This script exercises the inner join as well as the left, right and full outer joins as implemented in
Apache Spark. The Employee and Department tables were inspired by the examples in Wikipedia's article
on the join operation: https://en.wikipedia.org/wiki/Join_(SQL)
Employee
(31, 'Rafferty')
(33, 'Jones')
(33, 'Heisenberg')
@yoyama
yoyama / Schema2CaseClass.scala
Created January 20, 2017 07:36
Generate case class from spark DataFrame/Dataset schema.
/**
* Generate Case class from DataFrame.schema
*
* val df:DataFrame = ...
*
* val s2cc = new Schema2CaseClass
* import s2cc.implicit._
*
* println(s2cc.schemaToCaseClass(df.schema, "MyClass"))
*
@marwei
marwei / how_to_reset_kafka_consumer_group_offset.md
Created November 9, 2017 23:39
How to Reset Kafka Consumer Group Offset

Kafka 0.11.0.0 (Confluent 3.3.0) added support for manipulating a consumer group's offsets through the kafka-consumer-groups CLI command.

  1. List the topics to which the group is subscribed
kafka-consumer-groups --bootstrap-server <kafkahost:port> --group <group_id> --describe

Note the values under "CURRENT-OFFSET" and "LOG-END-OFFSET". "CURRENT-OFFSET" is the offset that this consumer group has reached in each partition.
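
The gap between the two columns is the number of messages the group has not yet consumed in a partition. A small Python sketch of that arithmetic, using made-up offsets (the partition numbers and values below are hypothetical, not output from a real cluster):

```python
def consumer_lag(current_offset, log_end_offset):
    """Messages not yet consumed in one partition."""
    return log_end_offset - current_offset


# Hypothetical {partition: (CURRENT-OFFSET, LOG-END-OFFSET)} values.
partitions = {0: (120, 150), 1: (75, 75)}

lag_by_partition = {
    partition: consumer_lag(current, end)
    for partition, (current, end) in partitions.items()
}
total_lag = sum(lag_by_partition.values())

print(lag_by_partition, total_lag)
```

A lag of zero, as for partition 1 here, means the group has caught up with the end of the log for that partition.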

  2. Reset the consumer offset for a topic (preview)
@weaming
weaming / boostnote2md.py
Last active August 7, 2023 09:51
Convert boostnote cson format data to markdown
#!/usr/bin/env python3
# coding: utf-8
"""
Author : weaming
Created Time : 2018-05-26 21:32:59
Prerequisite:
python3 -m pip install cson arrow
"""
import json
import os
@davideicardi
davideicardi / README.md
Last active September 9, 2024 10:20
Write and read Avro records from a byte array

Avro serialization

There are 4 possible serialization formats when using Avro:

@rmoff
rmoff / docker-compose.yml
Last active June 2, 2024 17:14
Docker-Compose for Kafka and Zookeeper with internal and external listeners
---
version: '2'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
  kafka: