- Word Count
import sys
from operator import add
from pyspark import SparkContext

if __name__ == "__main__":
    if len(sys.argv) != 2:
        exit("Usage: wordcount <file>")
    sc = SparkContext(appName="PythonWordCount")
    counts = (sc.textFile(sys.argv[1])
              .flatMap(lambda line: line.split())
              .map(lambda word: (word, 1))
              .reduceByKey(add))
    for word, count in counts.collect():
        print("%s: %i" % (word, count))
    sc.stop()
-- This is a Hive program. Hive is an SQL-like language that compiles
-- into Hadoop Map/Reduce jobs. It's very popular among analysts at
-- Facebook, because it allows them to query enormous Hadoop data
-- stores using a language much like SQL.
--
-- Our logs are stored on the Hadoop Distributed File System, in the
-- directory /logs/randomhacks.net/access. They're ordinary Apache
-- logs in *.gz format.
--
-- We want to pretend that these gzipped log files are a database table,
This uses Twitter Bootstrap classes for CodeIgniter pagination.
Drop this file into application/config.
(by @andrestaltz)
So you're curious about learning this new thing called Reactive Programming, particularly its variant comprising Rx, Bacon.js, RAC, and others.
Learning it is hard, and it's made even harder by the lack of good material. When I started, I tried looking for tutorials. I found only a handful of practical guides, but they just scratched the surface and never tackled the challenge of building a whole architecture around it. Library documentation often doesn't help when you're trying to understand some function. I mean, honestly, look at this:
Rx.Observable.prototype.flatMapLatest(selector, [thisArg])
Projects each element of an observable sequence into a new sequence of observable sequences by incorporating the element's index and then transforms an observable sequence of observable sequences into an observable sequence producing values only from the most recent observable sequence.
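To make that dense description concrete, here is a toy, synchronous Python model of the flatMapLatest semantics. This is not Rx itself; the event-list representation and the function name are invented for illustration. Each outer element is projected (with its index) into an inner sequence, and output values are kept only from the most recently started inner sequence.

```python
def switch_latest(outer_events, selector):
    """Toy, synchronous model of Rx's flatMapLatest.

    outer_events: time-sorted list of (time, value) pairs.
    selector(value, index): returns a list of (offset, inner_value)
        pairs for the inner sequence started by that outer element,
        with offsets measured from the outer element's arrival time.

    Each outer element starts a new inner sequence; an inner sequence
    is cut off as soon as the next outer element arrives, so only the
    most recent inner sequence ever contributes values.
    """
    starts = [(t, selector(v, i)) for i, (t, v) in enumerate(outer_events)]
    out = []
    for i, (start, inner) in enumerate(starts):
        # This inner sequence is "live" until the next outer element starts.
        end = starts[i + 1][0] if i + 1 < len(starts) else float("inf")
        out.extend((start + dt, v) for (dt, v) in inner if start + dt < end)
    return [v for (t, v) in sorted(out)]


# Outer emits "a" at t=0 and "b" at t=5; each projects to an inner
# sequence emitting at offsets 1 and 10 after it starts.
result = switch_latest(
    [(0, "a"), (5, "b")],
    lambda v, i: [(1, v + "1"), (10, v + "2")],
)
# "a2" (due at t=10) is dropped, because "b"'s inner sequence started
# at t=5 and superseded "a"'s: result == ["a1", "b1", "b2"]
```

That dropped "a2" is the whole point of the operator: values come "only from the most recent observable sequence".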
package object mail {

  implicit def stringToSeq(single: String): Seq[String] = Seq(single)
  implicit def liftToOption[T](t: T): Option[T] = Some(t)

  sealed abstract class MailType
  case object Plain extends MailType
  case object Rich extends MailType
  case object MultiPart extends MailType
<?php if ( ! defined('BASEPATH')) exit('No direct script access allowed');

/**
 * Rating Library
 * Using jQuery Raty plugin to rate products
 * @author Nikola Katsarov
 * @website http://katsarov.biz
 */
class Rating {
package botkop.sparti.receiver

import com.rabbitmq.client._
import org.apache.spark.Logging
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.StreamingContext
import org.apache.spark.streaming.dstream.ReceiverInputDStream
import org.apache.spark.streaming.receiver.Receiver

import scala.reflect.ClassTag
export SCALA_VERSION=scala-2.11.5
sudo wget http://www.scala-lang.org/files/archive/${SCALA_VERSION}.tgz
# Note: "sudo echo ... > file" does not work, because the redirection
# runs as the unprivileged user; use tee instead.
echo "SCALA_HOME=/usr/local/scala/${SCALA_VERSION}" | sudo tee /etc/profile.d/scala.sh
echo 'export SCALA_HOME' | sudo tee -a /etc/profile.d/scala.sh
sudo mkdir -p /usr/local/scala
sudo cp ${SCALA_VERSION}.tgz /usr/local/scala/
cd /usr/local/scala/
sudo tar xvf ${SCALA_VERSION}.tgz
sudo rm -f ${SCALA_VERSION}.tgz
sudo chown -R root:root /usr/local/scala
def ipToLong(ipAddress: String): Long =
  ipAddress.split("\\.").reverse.zipWithIndex
    .map { case (octet, i) => octet.toInt * math.pow(256, i).toLong }
    .sum

def longToIP(long: Long): String =
  (0 until 4).map(i => long / math.pow(256, i).floor.toInt % 256)
    .reverse.mkString(".")
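For comparison, the same base-256 positional arithmetic sketched in Python (the function names here are my own, not from the Scala snippet): each octet is a digit, so an address maps to an integer via octet * 256^position, and back via integer division and modulo.

```python
def ip_to_long(ip: str) -> int:
    """Dotted-quad string -> integer, treating octets as base-256 digits."""
    total = 0
    # Reversed so index 0 is the least significant octet.
    for power, octet in enumerate(reversed(ip.split("."))):
        total += int(octet) * 256 ** power
    return total


def long_to_ip(n: int) -> str:
    """Integer -> dotted-quad string, extracting octets most significant first."""
    return ".".join(str(n // 256 ** p % 256) for p in reversed(range(4)))


# Round trip: ip_to_long("192.168.1.1") == 3232235777
#             long_to_ip(3232235777) == "192.168.1.1"
```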
/*
This example uses Scala. Please see the MLlib documentation for a Java example.

Try running this code in the Spark shell. It may produce different topics each
time (since LDA includes some randomization), but it should give topics similar
to those listed above.

This example is paired with a blog post on LDA in Spark: http://databricks.com/blog
Spark: http://spark.apache.org/
*/
import scala.collection.mutable |