spark2-shell --master yarn --executor-memory 8G --executor-cores 4 --driver-memory 16G --conf spark.dynamicAllocation.maxExecutors=64 --conf spark.executor.memoryOverhead=2048 --jars /srv/deployment/analytics/refinery/artifacts/refinery-job.jar,/srv/deployment/analytics/refinery/artifacts/refinery-hive.jar
Three levels (job -> attempt -> application) to reach cassandra
The newly built cassandra jar (with exclusions) is in /tmp/oozie-nuria
hdfs dfs -rm -r /tmp/oozie-nuria; hdfs dfs -mkdir /tmp/oozie-nuria; hdfs dfs -put oozie/* /tmp/oozie-nuria;
Start job:
// spark2-shell --jars /srv/deployment/analytics/refinery/artifacts/refinery-job.jar
/**
 * Use RefineTarget.find to find all Refine targets for an input (camus job) in the last N hours.
 * Then keep only those for which the _REFINED_FAILED flag exists.
 */
import org.apache.hadoop.fs.Path
import org.joda.time.format.DateTimeFormatter
import com.github.nscala_time.time.Imports._
// This is the EventStreams RecentChange stream endpoint
var url = 'https://stream.wikimedia.org/v2/stream/recentchange';
// Use EventSource (available in most browsers, or as an
// npm module: https://www.npmjs.com/package/eventsource)
// to subscribe to the stream.
var recentChangeStream = new EventSource(url);
// Print each event to the console
recentChangeStream.onmessage = function(message) {
    // message.data holds the event payload as a JSON string
    console.log(JSON.parse(message.data));
};
curl -X POST 'http://localhost:8082/druid/v2/?pretty' -H 'Content-Type:application/json' -H 'Accept:application/json' -d '{
  "queryType":"segmentMetadata",
  "dataSource":"wmf_netflow",
  "intervals":["2019-09-01/2019-10-01"]
}'
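The same segmentMetadata query can be built from a script instead of curl. The following is a minimal sketch using only the Python standard library; the function name `build_segment_metadata_request` is hypothetical, and the broker URL is simply copied from the curl command above. The sketch only constructs the request; sending it is left commented out because it assumes a reachable Druid broker.

```python
import json
import urllib.request

# Assumption: same broker URL as in the curl command above.
DRUID_URL = "http://localhost:8082/druid/v2/?pretty"


def build_segment_metadata_request(data_source, intervals, url=DRUID_URL):
    """Build (but do not send) a POST request for a segmentMetadata query."""
    query = {
        "queryType": "segmentMetadata",
        "dataSource": data_source,
        "intervals": intervals,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


request = build_segment_metadata_request("wmf_netflow", ["2019-09-01/2019-10-01"])

# To run the query against a reachable broker, uncomment:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```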
# From stat1004:
# pyspark2 --jars ~otto/spark-sql-kafka-0-10_2.11-2.3.1.jar,~otto/kafka-clients-1.1.0.jar
# Need spark-sql-kafka for DataStream source and kafka-clients for Kafka serdes.
from pyspark.sql.functions import *
from pyspark.sql.types import *
# Declare a Spark schema that matches the JSONData.
# In a future MEP world this would be automatically loaded
# from a JSONSchema.
select CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')) AS date,
       count(1) as n_events
from event.externalguidance
where year=2019 and month=6
  and not useragent.is_bot
  and event.action = 'init'
group by year, month, day
order by date
limit 1000000
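The CONCAT/LPAD expression in the query above turns integer year, month, and day partition values into a zero-padded date string. As a small illustration of that formatting step only, here is a sketch in Python; the helper name `partition_date` is hypothetical and is not part of the query or of any library.

```python
def partition_date(year, month, day):
    """Format integer year/month/day partition values as YYYY-MM-DD."""
    # Zero-pad month and day to two digits, like LPAD(x, 2, '0') in the query.
    return f"{year}-{month:02d}-{day:02d}"


print(partition_date(2019, 6, 3))  # 2019-06-03
```

Zero-padding matters here because the query orders by the resulting string: without it, '2019-6-10' would sort before '2019-6-3'.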
#!/usr/bin/python
import sys
import math
f = sys.argv[1]
_file = open(f)
data = {}
set number
syntax enable
set cursorline
set showcmd
" show invisible chars
set listchars=tab:→\ ,space:·,nbsp:␣,trail:•,eol:¶,precedes:«,extends:»
set list |