#!/usr/bin/env perl
# Matt Post <[email protected]>
# April 2013
# This script turns a single column of sorted text into LaTeX-formatted multiple columns spanning
# multiple PDF pages.
# Its input is a single column of text on STDIN (each line is a complete entry). Two optional
# arguments specify (a) the number of rows on the first page and (b) the number of rows on the
# remaining pages, with both defaulting to 45. The last page will be adjusted automatically.
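The layout described in the header comment can be sketched in Python. This is only an illustration of the idea, not the Perl script itself: the function names `paginate` and `to_latex_rows`, the `num_cols` parameter, and the `tabular`-style row output are all assumptions, since the preview does not show how the script chooses the column count or emits LaTeX.

```python
import math


def paginate(entries, num_cols=3, first_rows=45, other_rows=45):
    """Split entries into pages; each page is a list of rows, filled column by column.

    num_cols is a hypothetical parameter; the row defaults mirror the
    45-row default mentioned in the header comment.
    """
    pages = []
    pos = 0
    rows = first_rows
    while pos < len(entries):
        remaining = len(entries) - pos
        if remaining < rows * num_cols:
            # Last page: shrink the row count so the columns stay balanced.
            rows = math.ceil(remaining / num_cols)
        chunk = entries[pos:pos + rows * num_cols]
        # Column c holds chunk[c*rows:(c+1)*rows], so sorted order reads downward.
        columns = [chunk[c * rows:(c + 1) * rows] for c in range(num_cols)]
        page = [
            [col[r] if r < len(col) else "" for col in columns]
            for r in range(rows)
        ]
        pages.append(page)
        pos += len(chunk)
        rows = other_rows
    return pages


def to_latex_rows(page):
    """Render one page as the body rows of a LaTeX tabular (assumed output format)."""
    return "\n".join(" & ".join(row) + r" \\" for row in page)
```

With ten entries, two columns, and three rows per page, the first page holds entries 1 to 6 and the last page shrinks to two rows for the remaining four.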
#!/usr/bin/perl
# Returns the requested line number from a file or list of files.
# If the line number is given as i:j or i-j, selects that range.
# If no file is given, we read from STDIN.
my $arg = shift;
($num1,$split,$num2) = split(/([:\-\+])/,$arg);
die usage() unless $arg and (! $split or $num2);
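For comparison, here is a minimal Python sketch of the same argument handling. It covers only the `i`, `i:j`, and `i-j` forms named in the comments; the Perl regex also accepts `+` as a separator, but the preview does not say what that means, so it is left out. The names `parse_spec` and `select_lines` are hypothetical.

```python
import re
from itertools import islice


def parse_spec(spec):
    """Return (first, last) as 1-based inclusive line numbers for 'i', 'i:j', or 'i-j'."""
    match = re.fullmatch(r"(\d+)(?:[:\-](\d+))?", spec)
    if not match:
        raise ValueError(f"bad line spec: {spec!r}")
    first = int(match.group(1))
    last = int(match.group(2)) if match.group(2) else first
    if first < 1 or last < first:
        raise ValueError(f"bad line range: {spec!r}")
    return first, last


def select_lines(lines, spec):
    """Return the requested line or range from any iterable of lines."""
    first, last = parse_spec(spec)
    # islice stops reading as soon as the last requested line has been seen.
    return list(islice(lines, first - 1, last))
```

Passing `sys.stdin` or an open file as `lines` gives the STDIN and file behaviour the comments describe.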
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Python *sucks* at UTF-8 (don't tell me "It's fixed in Python 3"; I don't care, plus no one uses Python 3)
# If you put this at the top of every Python script, however, it gets rid of most of the headaches dealing with STDIN
# and STDOUT (basically, akin to "perl -C31"). I don't know if it's all necessary; I just know that if I put it at
# the top of my scripts, most of the problems go away, and I can stop thinking about it.
import sys
import codecs
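The preview stops at the imports, so the rest of the preamble is not shown. One plausible shape for it, sketched here as an assumption, is to wrap the raw byte streams in UTF-8 readers and writers from `codecs`. The helper name `utf8_streams` is hypothetical, and in-memory `BytesIO` buffers stand in for STDIN and STDOUT so the example runs anywhere.

```python
import codecs
import io


def utf8_streams(raw_in, raw_out):
    """Wrap binary streams so reads are decoded and writes are encoded as UTF-8."""
    reader = codecs.getreader("utf-8")(raw_in)
    writer = codecs.getwriter("utf-8")(raw_out)
    return reader, writer


# Stand-ins for STDIN and STDOUT; a real script would pass the process's
# binary streams instead (sys.stdin.buffer and sys.stdout.buffer in Python 3).
raw_in = io.BytesIO("naïve café\n".encode("utf-8"))
raw_out = io.BytesIO()

reader, writer = utf8_streams(raw_in, raw_out)
for line in reader:
    writer.write(line.upper())
```

The point of the wrapping is that the loop body only ever sees decoded text, regardless of the locale the script is run under.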
#!/usr/bin/env python
"""
Looks at all the *.ics files in the current directory, removes the X- keys,
and generates a new UUID. This is used for restoring an accidentally-deleted
calendar in Apple's Calendar program; it is a rewrite of the node.js version
that is linked to from here:
http://fokkezb.nl/2015/01/13/how-to-restore-a-deleted-icloud-calendar/
"""
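The two steps named in the *.ics docstring (drop the X- keys, assign a new UUID) might look roughly like this for the text of one file. This is a sketch under assumptions, not the gist's code: the function name `clean_ics` is hypothetical, the UID is assumed to appear as a plain `UID:` property, and every `UID:` line gets its own fresh UUID. Folded continuation lines (which start with a space or tab in iCalendar) are dropped along with the property they belong to.

```python
import uuid


def clean_ics(text):
    """Drop X- properties and replace each UID value in iCalendar text."""
    out = []
    skipping = False  # True while inside the folded continuation of a dropped property
    for line in text.splitlines():
        if line.startswith((" ", "\t")):
            # Folded continuation line: keep it only if its property was kept.
            if not skipping:
                out.append(line)
            continue
        if line.upper().startswith("X-"):
            skipping = True
            continue
        if line.upper().startswith("UID:"):
            out.append("UID:" + str(uuid.uuid4()).upper())
            skipping = True  # discard any folded remainder of the old UID
            continue
        skipping = False
        out.append(line)
    # iCalendar lines are terminated with CRLF.
    return "\r\n".join(out) + "\r\n"
```

Applying it to every `*.ics` file in a directory would be a short loop over `glob.glob("*.ics")`, reading each file, calling `clean_ics`, and writing the result back.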
# I can never remember the syntax for GNU parallel
## Treat each line of STDIN as a command to run, running at most 10 at a time (-j 10)
cat commands.txt | parallel -j 10
## Download a long list of files in parallel
cat files.txt | parallel -j 10 wget -q {}
## Run up to 10 parallel instances of COMMAND with FLAGS. Feed STDIN to these commands in blocks of roughly 10 MB (--block-size 10m). Assemble the outputs in order (-k).
cat large_input.txt | parallel -j 10 --pipe -k --block-size 10m COMMAND FLAGS > output.txt
#!/usr/bin/env python3
"""
This is code to take a trained Fairseq model and discard the Adam optimizer state,
which is not needed at test time. It can reduce model size by ~70%.
Original author: Brian Thompson
"""
from fairseq import checkpoint_utils
#!/usr/bin/env python3
import sys
import sacremoses
def main(args):
    """Tokenizes, preserving tabs"""
    mt = sacremoses.MosesTokenizer(lang=args.lang)
    def tok(s):
#!/usr/bin/env python3
import sys
from sacremoses.normalize import MosesPunctNormalizer
def main(args):
    normalizer = MosesPunctNormalizer(lang=args.lang, penn=args.penn)
    for line in sys.stdin:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
#
# Copyright 2019--2021 Matt Post <[email protected]>
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0