# Clone llama.cpp
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp

# Build it
make clean
LLAMA_METAL=1 make

# Download model
export MODEL=llama-2-13b-chat.ggmlv3.q4_0.bin
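As a hedged sketch of the remaining steps (the Hugging Face source and the exact flags are assumptions and may differ between llama.cpp versions): fetch the weights, then run the model with the freshly built main binary.

# Fetch the weights (assumption: TheBloke's GGML conversion on Hugging Face)
wget "https://huggingface.co/TheBloke/Llama-2-13B-chat-GGML/resolve/main/${MODEL}"

# Run a prompt; -ngl offloads layers to the Metal GPU (flag names vary by version)
./main -m "./${MODEL}" -n 256 -ngl 1 -p "Write a short poem about build systems."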
Flame graphs are a nifty debugging tool for determining where CPU time is being spent. Using the Java Flight Recorder, you can generate them for Java processes without adding significant runtime overhead.
Shivaram Venkataraman and I have found these flame recordings useful for diagnosing coarse-grained performance problems. We started using them at the suggestion of Josh Rosen, who quickly made one for the Spark scheduler when we were talking to him about why the scheduler caps out at a throughput of a few thousand tasks per second. Josh generated a graph similar to the one below, which illustrates that a significant amount of time is spent in serialization (if you click in the top right-hand corner and search for "serialize", you can see that 78.6% of the sampled CPU time was spent in serialization). We used this insight to speed up the scheduler.
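A minimal sketch of how such a recording can be captured on an Oracle JDK 8 JVM (this is not from the original workflow; the process id, the recording name, and the converter mentioned at the end are assumptions):

# unlock Flight Recorder and start a sampling recording on the running JVM
jcmd <pid> VM.unlock_commercial_features
jcmd <pid> JFR.start name=cpu settings=profile

# let the workload run for a while, then dump the recording to disk
jcmd <pid> JFR.dump name=cpu filename=recording.jfr

# recording.jfr can then be converted to folded stacks (for example with a
# converter such as jfr-flame-graph) and rendered with flamegraph.pl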
#!/bin/bash
# Run this on this AMI on AWS:
# https://console.aws.amazon.com/ec2/v2/home?region=us-east-1#LaunchInstanceWizard:ami=ami-b36981d8
# You should end up with a fully working GPU-enabled TensorFlow installation.
cd ~
# grab cuda 7.0
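Once the full script has run, a quick sanity check that the GPU build actually sees the card, as a hedged sketch (it assumes the pre-1.0 TensorFlow session API of that era):

# the driver should list the GPU
nvidia-smi
# creating a session with device-placement logging should show ops mapped to /gpu:0
python -c "import tensorflow as tf; tf.Session(config=tf.ConfigProto(log_device_placement=True))"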
By @dmvaldman
Functional Reactive Programming (FRP) is generating buzz as an alternative to Object-Oriented Programming (OOP) for certain use cases. However, an internet search quickly leads a curious and optimistic reader into the rabbit hole of monads, functors, and other technical jargon. I've since emerged from this dark and lonely place with the realization that these words are mere implementation details, and that the core concepts are far more universal. In fact, the groundwork was laid down many centuries before the first computer, and has more to do with interpretations of reality than with structuring programs. Allow me to explain.
There’s an old thought experiment that goes like this:
/**
 * This file contains the core idea of wrapping an underlying OutputFormat with an OutputFormat
 * with an augmented key that writes to partitions using MultipleOutputs (or something similar).
 */
package model.hadoop

import model.hadoop.HadoopIO.MultipleOutputer
import model.hadoop.HadoopIO.MultipleOutputer._
import org.apache.hadoop.io.{DataInputBuffer, NullWritable}
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# The unreasonable effectiveness of Character-level Language Models\n",
    "## (and why RNNs are still cool)\n",
    "\n",
    "### [Yoav Goldberg](http://www.cs.biu.ac.il/~yogo)\n",
""" | |
This is a batched LSTM forward and backward pass | |
""" | |
import numpy as np | |
import code | |
class LSTM: | |
@staticmethod | |
def init(input_size, hidden_size, fancy_forget_bias_init = 3): |
- Probabilistic Data Structures for Web Analytics and Data Mining: A great overview of the space of probabilistic data structures and how they are used in approximation algorithm implementations.
- Models and Issues in Data Stream Systems
- Philippe Flajolet's contribution to streaming algorithms: A presentation by Jérémie Lumbroso that revisits some of the historical perspective and how it all began with Flajolet.
- Approximate Frequency Counts over Data Streams by Gurmeet Singh Manku & Rajeev Motwani: One of the early papers on the subject.
- [Methods for Finding Frequent Items in Data Streams](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.9800&rep=rep1&t