Dharmesh Kakadia dharmeshkakadia

Returns to Scale vs Experience

How many wristwatches are cheaper than Big Ben?

Abstract

Big machines are sometimes more efficient. But they cost more, so fewer can be produced with a finite budget. Small machines are cheaper and may benefit from improvement over time, driven by experience in building more units. When does this experience lead to greater overall efficiency? We derive an approximation which, given a learning rate, tells how much smaller a machine must be to overcome an initial efficiency disadvantage.

Background

Learning curves were characterized in the context of industrial production in the 1930s by Wright.[^1] The production cost of a machine follows a power law in the number of units made so far
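The power law mentioned above is commonly written as C(n) = C(1) · n^(-b), where a "learning rate" r means each doubling of cumulative output cuts unit cost by the fraction r, so b = -log2(1 - r). The snippet below is a minimal sketch of that standard form. The function names, the 20% learning rate, and the first-unit cost of 100 are illustrative assumptions, not values taken from the text, and the author's own parameterization may differ.

```python
import math


def wright_exponent(learning_rate: float) -> float:
    """Exponent b such that doubling cumulative output multiplies unit cost by (1 - learning_rate)."""
    if not 0.0 <= learning_rate < 1.0:
        raise ValueError("learning_rate must be in [0, 1)")
    return -math.log2(1.0 - learning_rate)


def unit_cost(n: int, first_unit_cost: float, learning_rate: float) -> float:
    """Cost of the n-th unit under the power law C(n) = C(1) * n ** (-b)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return first_unit_cost * n ** (-wright_exponent(learning_rate))


def total_cost(units: int, first_unit_cost: float, learning_rate: float) -> float:
    """Cumulative cost of building units 1 through `units`."""
    return sum(unit_cost(n, first_unit_cost, learning_rate) for n in range(1, units + 1))


if __name__ == "__main__":
    # Hypothetical numbers: first unit costs 100, assumed 20% learning rate.
    for n in (1, 2, 4, 8):
        print(n, round(unit_cost(n, 100.0, 0.2), 2))
```

With these assumed numbers the second unit costs about 80 and the fourth about 64, which is the "cost falls by a fixed fraction per doubling" behaviour that the small-machine argument relies on.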

Comparison of ZSH frameworks and plugin managers

Changelog

update 1: add a FAQ section
update 2: benchmark chart and feature comparison table
update 3:
- improve the table with missing features for antigen
- new zplg times result

TLDR

I've been deceiving you all. I had you believe that Svelte was a UI framework — unlike React and Vue etc, because it shifts work out of the client and into the compiler, but a framework nonetheless.

But that's not exactly accurate. In my defense, I didn't realise it myself until very recently. But with Svelte 3 around the corner, it's time to come clean about what Svelte really is.

Svelte is a language.

Specifically, Svelte is an attempt to answer a question that many people have asked, and a few have answered: what would it look like if we had a language for describing reactive user interfaces?

A few projects that have answered this question:

Spark Tips & Tricks

Misc. Tips & Tricks

If values are integers in [0, 255], Parquet will automatically compress to use 1 byte unsigned integers, thus decreasing the size of the saved DataFrame by a factor of 8 (compared with storing them as 8-byte integers).
Partition DataFrames to have evenly-distributed, ~128MB partition sizes (empirical finding). Always err on the higher side w.r.t. number of partitions.
Pay particular attention to the number of partitions when using flatMap, especially if the following operation will result in high memory usage. The flatMap op usually results in a DataFrame with a [much] larger number of rows, yet the number of partitions will remain the same. Thus, if a subsequent op causes a large expansion of memory usage (e.g. converting a DataFrame of indices to a DataFrame of large Vectors), the memory usage per partition may become too high. In this case, it is beneficial to repartition the output of flatMap to a number of partitions that will safely allow for appropriate partition memory sizes, based upon the
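The partition-sizing tips above can be turned into a small calculation. This is a rough sketch, assuming you already have an estimate of the total data size in bytes after the memory-expanding step. The helper name `suggested_partitions` and the sample sizes are hypothetical. Rounding up follows the advice to err on the higher side.

```python
import math

# ~128MB target per partition, taken from the tip above.
TARGET_PARTITION_BYTES = 128 * 1024 * 1024


def suggested_partitions(total_bytes: int, target_bytes: int = TARGET_PARTITION_BYTES) -> int:
    """Return a partition count that keeps each partition at or below target_bytes on average."""
    if target_bytes <= 0:
        raise ValueError("target_bytes must be positive")
    if total_bytes <= 0:
        return 1
    # Round up so partitions come out slightly smaller rather than larger.
    return max(1, math.ceil(total_bytes / target_bytes))


if __name__ == "__main__":
    # Hypothetical estimate: 10 GiB of data after the expanding operation.
    print(suggested_partitions(10 * 1024 ** 3))
```

The resulting number would typically be passed to `DataFrame.repartition(n)` on the output of the flatMap step. How to estimate the expanded size is left open here, since it depends on the data.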

General Background and Overview

Probabilistic Data Structures for Web Analytics and Data Mining : A great overview of the space of probabilistic data structures and how they are used in approximation algorithm implementation.
Models and Issues in Data Stream Systems
Philippe Flajolet’s contribution to streaming algorithms : A presentation by Jérémie Lumbroso that visits some of the historical perspectives and how it all began with Flajolet.
Approximate Frequency Counts over Data Streams by Gurmeet Singh Manku & Rajeev Motwani : One of the early papers on the subject.
[Methods for Finding Frequent Items in Data Streams](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.9800&rep=rep1&t

Make it real

Ideas are cheap. Make a prototype, sketch a CLI session, draw a wireframe. Discuss around concrete examples, not hand-waving abstractions. Don't say you did something, provide a URL that proves it.

Ship it

Nothing is real until it's being used by a real user. This doesn't mean you make a prototype in the morning and blog about it in the evening. It means you find one person you believe your product will help and try to get them to use it.

Do it with style

Mac web developer apps

This gist's comment stream is a collection of webdev apps for OS X. Feel free to add links to apps you like, just make sure you add some context to what it does — either from the creator's website or your own thoughts.

— Erik

	import java.io.IOException;
	import java.net.URLClassLoader;
	import java.nio.file.Files;
	import java.nio.file.Paths;
	import java.nio.file.Path;

	/**
	* Example demonstrating a ClassLoader leak.
	*
	* <p>To see it in action, copy this file to a temp directory somewhere,

	#include "tweetnacl.h"

	#define FOR(i,n) for (i = 0;i < n;++i)
	#define sv static void

	typedef unsigned char u8;
	typedef unsigned int u32;
	typedef unsigned long long u64;
	typedef long long i64;
	typedef i64 gf[16];