@ECHO OFF
REM recursively traverse through directories and delete all instances of JPG|PNG|TIF|JP2 image files
CHOICE /C:12345 /M "Really delete all images of type (1) JPG, (2) JP2, (3) TIF, (4) PNG or (5) Cancel?"
IF ERRORLEVEL 5 GOTO Cancel
IF ERRORLEVEL 4 GOTO PNG
IF ERRORLEVEL 3 GOTO TIF
IF ERRORLEVEL 2 GOTO JP2
IF ERRORLEVEL 1 GOTO JPG
GOTO END
:JPG
#!/bin/bash
# Usage:
# ./eunp-img-conversion.sh input.tif temp.tif output.jp2
# 1. Invoke GraphicsMagick command line to convert master images to uncompressed 150ppi TIF with unsharp mask
# 2. Invoke Kakadu kdu_compress command line to convert uncompressed TIF to JPEG 2000
# Step 2 reads the file written by step 1, so run it only after step 1 has finished successfully.
gm convert "$1" -resample 150x150 -unsharp 1.5 -compress None "ptif:$2" && \
kdu_compress -i "$2" -o "$3" -rate 1.0,0.84,0.7,0.6,0.5,0.4,0.35,0.3,0.25,0.21,0.18,0.15,0.125,0.1,0.088,0.075,0.0625,0.05,0.04419,0.03716,0.03125,0.025,0.0221,0.01858,0.015625 Clevels=6 Stiles=\{1024,1024\} Cmodes=\{BYPASS\} Corder=RLCP Cblk=\{64,64\} -no_palette
register './warcbase_kb/target/warcbase-0.1.0-SNAPSHOT-fatjar.jar';
raw = load '/tmp/IAH-20080430204825-00000-blackbook.arc.gz' using
  org.warcbase.pig.ArcLoader() as (url: chararray, date:chararray, mime:chararray, content:chararray);
a = foreach raw generate url,mime,content,SUBSTRING(date,0,12) as date,org.warcbase.pig.piggybank.DetectMimeType(content) as tikaMime;
b = filter a by (tikaMime == 'text/html');
c = foreach b generate url,mime,tikaMime,date,org.warcbase.pig.piggybank.ExtractRawText(content) as txt;
d = foreach c generate url,mime,tikaMime,date,org.warcbase.pig.piggybank.DetectLanguage(txt) as lang;
e = group d by (lang,date);
BufferedReader getReader (String fileUrl) throws IOException {
    InputStreamReader reader;
    try {
        reader = new FileReader(fileUrl);
    }
    catch (FileNotFoundException e) {
        // try a real URL instead
        URL url = new URL(fileUrl);
        reader = new InputStreamReader (url.openStream());
    }
import org.apache.commons.lang.StringUtils;
import java.text.DecimalFormat;
double ld = StringUtils.getLevenshteinDistance(text1, text2);
double avglen = ((double)text1.length()+(double)text2.length())/2.0;
double m = 1.0-(ld/avglen);
double normVal = (m<0)?0.0:m;
float f = (float) normVal * 100;
DecimalFormat s = new DecimalFormat("##.##");
normalized_levenshtein_distance = s.format(f);
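The snippet above turns a Levenshtein distance into a percentage by dividing by the average length of the two strings, subtracting from 1, and clamping at 0. The same arithmetic can be sketched in Python as follows. This is only an illustration: the function names `levenshtein` and `similarity_percent` are hypothetical, the edit distance is a plain dynamic-programming version standing in for the `StringUtils` call, and returning 100.0 for two empty strings is an assumption that the original does not make.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity_percent(text1: str, text2: str) -> float:
    """Normalize the distance by the average length and express it as a percentage."""
    avg_len = (len(text1) + len(text2)) / 2.0
    if avg_len == 0:
        # Assumption: two empty strings count as identical.
        return 100.0
    m = 1.0 - levenshtein(text1, text2) / avg_len
    # Clamp at 0, as the original does, then round to two decimals.
    return round(max(m, 0.0) * 100, 2)


print(similarity_percent("kitten", "sitting"))  # 53.85
```

Because the distance can exceed the average length (for example a one-character string against a five-character one), the clamp is what keeps the result inside 0 to 100.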
List leftList = new ArrayList();
List rightList = new ArrayList();
String[] lines = csv.split("\n");
for (String line : lines) {
    // each line holds two quoted URLs; splitting on '"' puts them at indices 1 and 3
    String[] urls = line.split("\"");
    leftList.add(urls[1]);
    rightList.add(urls[3]);
}
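Splitting on the double-quote character works because a line with two quoted fields breaks into five pieces, with the field contents at indices 1 and 3. A rough Python sketch of the same idea is below. The function name `split_url_pairs` and the sample input are hypothetical, and skipping lines with fewer than two quoted fields is an added assumption (the snippet above would fail on such a line instead).

```python
def split_url_pairs(csv_text: str) -> tuple[list[str], list[str]]:
    """Collect the first and second quoted field of every line into two lists."""
    left, right = [], []
    for line in csv_text.split("\n"):
        parts = line.split('"')
        # '"a","b"' splits into ['', 'a', ',', 'b', ''], so a valid line has at least 4 parts.
        if len(parts) < 4:
            continue  # assumption: silently skip blank or malformed lines
        left.append(parts[1])
        right.append(parts[3])
    return left, right


sample = '"http://example.org/a","http://example.org/b"\n"http://example.org/c","http://example.org/d"\n'
left_urls, right_urls = split_url_pairs(sample)
print(left_urls)
print(right_urls)
```

This approach assumes the fields never contain an escaped double quote; for input that might, a real CSV parser is the safer choice.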
#!/usr/bin/ruby
def rename(dir, map)
  Dir.foreach(dir) do |filename|
    next if filename =~ /^\.+$/ or File.directory?("#{dir}/#{filename}")
    (entry, extension) = filename.sub("file", "").split(".")
    entry.sub!(/^0+/, "")
    if map[entry].nil?
      raise "PROBLEM: no entry for file #{dir}/#{filename} with id #{entry}"
    else
#!/bin/bash
# Hadoop cluster start-up script
#
# 1. Format the namenode (only required on 1st start!)
# sudo -u hdfs hdfs namenode -format
# 2. Start HDFS
for x in `cd /etc/init.d ; ls hadoop-hdfs-*` ; do sudo service $x start ; done
# 3. Create the /temp directory
// Converts XML to JSON
// from: http://coursesweb.net/javascript/convert-xml-json-javascript_s2
function XMLtoJSON() {
  var me = this; // stores the object instance
  // gets the content of an XML file and returns it in
  me.fromFile = function(xml, rstr) {
    // Creates an instance of a XMLHttpRequest object
    var xhttp = (window.XMLHttpRequest) ? new XMLHttpRequest() : new ActiveXObject("Microsoft.XMLHTTP");
    // sets and sends the request for calling "xml"
FOR /R %%a IN (*.foo) DO python foo.py "%%a" > "%%~dpna.foo"