Ian ionox0

Using strace and lsof to debug blocked processes

You can use strace on a specific pid to figure out what a specific process is doing, e.g.:

strace -fp <pid>

You might see something like:

select(9, [3 5 8], [], [], {0, 999999}) = 0 (Timeout)

Stevey's Google Platforms Rant

I was at Amazon for about six and a half years, and now I've been at Google for that long. One thing that struck me immediately about the two companies -- an impression that has been reinforced almost daily -- is that Amazon does everything wrong, and Google does everything right. Sure, it's a sweeping generalization, but a surprisingly accurate one. It's pretty crazy. There are probably a hundred or even two hundred different ways you can compare the two companies, and Google is superior in all but three of them, if I recall correctly. I actually did a spreadsheet at one point but Legal wouldn't let me show it to anyone, even though recruiting loved it.

I mean, just to give you a very brief taste: Amazon's recruiting process is fundamentally flawed by having teams hire for themselves, so their hiring bar is incredibly inconsistent across teams, despite various efforts they've made to level it out. And their operations are a mess; they don't real

	Latency Comparison Numbers (~2012)
	----------------------------------
	L1 cache reference 0.5 ns
	Branch mispredict 5 ns
	L2 cache reference 7 ns 14x L1 cache
	Mutex lock/unlock 25 ns
	Main memory reference 100 ns 20x L2 cache, 200x L1 cache
	Compress 1K bytes with Zippy 3,000 ns 3 us
	Send 1K bytes over 1 Gbps network 10,000 ns 10 us
	Read 4K randomly from SSD* 150,000 ns 150 us ~1GB/sec SSD

	-- This is a Hive program. Hive is an SQL-like language that compiles
	-- into Hadoop Map/Reduce jobs. It's very popular among analysts at
	-- Facebook, because it allows them to query enormous Hadoop data
	-- stores using a language much like SQL.

	-- Our logs are stored on the Hadoop Distributed File System, in the
	-- directory /logs/randomhacks.net/access. They're ordinary Apache
	-- logs in *.gz format.
	--
	-- We want to pretend that these gzipped log files are a database table,


	FILE SPACING:

	# double space a file
	sed G

	# double space a file which already has blank lines in it. Output file
	# should contain no more than one blank line between lines of text.
	sed '/^$/d;G'

	# Staring FASTQ files
	export FQ1=1.fq
	export FQ2=2.fq

	# The names of the random subsets
	export FQ1SUBSET=1.rand.fq
	export FQ2SUBSET=2.rand.fq

	# How many random pairs do we want?
	export N=100