@saagarjha
saagarjha / hn_timestamps.py
Created August 6, 2019 16:25
Grab timestamps for all your Hacker News comments
#!/usr/bin/env python3
import json
import re
import urllib.request
if __name__ == "__main__":
    comments = json.load(urllib.request.urlopen("https://hacker-news.firebaseio.com/v0/user/saagarjha.json"))["submitted"][::-1]
    for comment in comments:
        timestamp = json.load(urllib.request.urlopen("https://hacker-news.firebaseio.com/v0/item/" + str(comment) + ".json"))["time"]

Foreword

This document was originally written several years ago. At the time I was working as an execution core verification engineer at Arm. The following points are coloured heavily by working in and around the execution cores of various processors. Apply a pinch of salt; points contain varying degrees of opinion.

It is still my opinion that RISC-V could be much better designed, though I will also say that if I were building a 32- or 64-bit CPU today I'd likely implement the architecture to benefit from the existing tooling.

This document is mostly based upon the RISC-V ISA spec v2.0. Some updates have been made for v2.2.

Original Foreword: Some Opinion

The RISC-V ISA has pursued minimalism to a fault. There is a large emphasis on minimizing instruction count, normalizing encoding, etc. This pursuit of minimalism has resulted in false orthogonalities (such as reusing the same instruction for branches, calls and returns) and a requirement for superfluous instructions which impacts code density both in terms of size and

@Mrnikbobjeff
Mrnikbobjeff / AvxNullFind.cs
Last active September 21, 2020 10:57
quick hacky way to find null references
using System;
using System.Runtime.CompilerServices;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;
using System.Threading;
namespace ObjectPools
{
public class FastRefPool<T> where T : class
{
@AshKash
AshKash / maps.cpp
Last active June 16, 2019 03:35
C++ unordered_maps vs Go map benchmark
#include <unordered_map>
#include <string>
#include <vector>
const int numOfStrings = 100000;
const int numOfIterations = 1000;
// adapted from: https://medium.com/@griffinish/c-and-golang-neck-and-neck-on-maps-e7867adfadc6
// g++ -std=c++0x -O3 -o maps_cxx maps.cpp
//
// time ./maps_cxx
@usefulcat
usefulcat / filtervlan.rs
Created September 19, 2018 15:36
Rust program (well, most of it) that reads pcap data from stdin and writes same to one or more files and/or to stdout.
use std::io::{self, BufReader, BufWriter, Read, Write};
use std::slice;
use std::env;
use std::fs;
extern crate etherparse;
use etherparse::*;
mod pcap;
@saagarjha
saagarjha / AppKit Abusers
Created May 21, 2018 05:38
Apps that have special-case workarounds in Apple's core frameworks, sorted by number of exceptions (from https://worthdoingbadly.com/appkitcompat/)
22 com.apple.iWork.Keynote
18 com.apple.iWork.Pages
16 com.apple.iWork.Numbers
15 com.apple.iPhoto
13 com.microsoft.Powerpoint
9 com.microsoft.Excel
9 com.apple.logic.pro
9 com.adobe.Photoshop
8 com.microsoft.Outlook
7 com.microsoft.Word

why doesn't radfft support AVX on PC?

So there are two separate issues here: using instructions added in AVX and using 256-bit wide vectors. The former turns out to be much easier than the latter for our use case.

Problem number 1 was that you positively need to put AVX code in a separate file with different compiler settings (/arch:AVX for VC++, -mavx for GCC/Clang) that make all SSE code emitted also use VEX encoding, and at the time radfft was written there was no way in CDep to set compiler flags for just one file, just for the overall build.

[There's the GCC "target" annotations on individual funcs, which in principle fix this, but I ran into nasty problems with this for several compiler versions, and VC++ has no equivalent, so we're not currently using that and just sticking with different compilation units.]
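For readers unfamiliar with the per-function "target" annotation mentioned above, here is a minimal sketch of what it looks like with GCC or Clang on x86. This is not radfft code, and it is the approach the text says is currently not used; the names `scale`, `scale_avx` and `scale_scalar` are hypothetical and chosen only for illustration. The file is assumed to be compiled without -mavx, so only the annotated function is allowed to use AVX, and a runtime check picks the path.

```cpp
#include <cstddef>

#if (defined(__x86_64__) || defined(__i386__)) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define HAVE_AVX_TARGET 1

// Hypothetical helper: only this function is compiled with AVX enabled,
// even though the rest of the file is built without -mavx.
__attribute__((target("avx")))
static void scale_avx(float* data, std::size_t n, float factor) {
    const __m256 f = _mm256_set1_ps(factor);
    std::size_t i = 0;
    // Process eight floats per iteration using unaligned loads and stores.
    for (; i + 8 <= n; i += 8) {
        __m256 v = _mm256_loadu_ps(data + i);
        _mm256_storeu_ps(data + i, _mm256_mul_ps(v, f));
    }
    // Handle the remaining elements one at a time.
    for (; i < n; ++i) data[i] *= factor;
}
#endif

// Portable fallback used when AVX is unavailable or unsupported by the compiler.
static void scale_scalar(float* data, std::size_t n, float factor) {
    for (std::size_t i = 0; i < n; ++i) data[i] *= factor;
}

// Hypothetical entry point: dispatches at runtime based on CPU support.
void scale(float* data, std::size_t n, float factor) {
#ifdef HAVE_AVX_TARGET
    if (__builtin_cpu_supports("avx")) {
        scale_avx(data, n, factor);
        return;
    }
#endif
    scale_scalar(data, n, factor);
}
```

The separate-compilation-unit approach described above achieves the same isolation by moving `scale_avx` into its own file built with -mavx (or /arch:AVX), which also works with VC++.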

The other issue is to do with CPU power management.

@tclementdev
tclementdev / libdispatch-efficiency-tips.md
Last active April 10, 2025 19:06
Making efficient use of the libdispatch (GCD)

libdispatch efficiency tips

libdispatch is one of the most misused APIs, both because of the way it was presented to us when it was introduced and for many years after that, and because of the confusing documentation and API. This page is a compilation of important things to know if you're going to use this library. Many references are available at the end of this document pointing to comments from Apple's very own libdispatch maintainer (Pierre Habouzit).

My take-aways are:

  • You should create very few, long-lived, well-defined queues. These queues should be seen as execution contexts in your program (gui, background work, ...) that benefit from executing in parallel. An important thing to note is that if these queues are all active at once, you will get as many threads running. In most apps, you probably do not need to create more than 3 or 4 queues.

  • Go serial first, and as you find performance bottlenecks, measure why, and if concurrency helps, apply with care, always validating under system pressure. Reuse

@EgorBo
EgorBo / FindDups.cs
Created March 16, 2018 09:30
FindDups.cs
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
namespace ConsoleApp13
{
class Program
{