import ciir.jfoley.chai.time.Debouncer;
import org.apache.lucene.analysis.en.EnglishAnalyzer;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.FSDirectory;
import org.jsoup.Jsoup;
import socket

host = socket.gethostname()
onSydney = (host == 'sydney.cs.umass.edu')
if onSydney:
    print('CRFSUITE:=/mnt/nfs/work3/jfoley/bin/crfsuite-0.12/bin/crfsuite')
    print("PREFIX=qsub -b y -cwd -sync y -l mem_free=8G -l mem_token=8G -o $@.out -e $@.err ")
    print("JAVA:=/mnt/nfs/work3/jfoley/bin/jdk1.8.0_31/bin/java -ea -Xmx7G")
    print("SUFFIX:=")  # log file created through qsub
else:
public static double computeAP(List<Boolean> isTruePositiveFromRanking, int numRelevant) {
  // If there are no relevant documents, the average is artificially
  // defined as zero, to mimic trec_eval.
  // Strictly, the output is NaN, or the query should be ignored [point of debate].
  if (numRelevant == 0) return 0;
  double sumPrecision = 0;
  int recallPointCount = 0;
  // Iterate over the ranking passed in as the parameter.
  for (int i = 0; i < isTruePositiveFromRanking.size(); i++) {
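The preview above is cut off inside the loop, so the rest of computeAP is not visible. As a rough illustration only, the standard definition of average precision can be sketched in Python as below. The function name `compute_ap` and its parameter names are hypothetical, and the loop body is the textbook formula rather than the author's own code; only the zero-relevant convention is taken from the comments above.

```python
from typing import Sequence


def compute_ap(is_relevant: Sequence[bool], num_relevant: int) -> float:
    """Average precision of one ranking (textbook definition, assumed here)."""
    # Same convention as the comments above: no relevant documents means 0.
    if num_relevant == 0:
        return 0.0
    sum_precision = 0.0
    hits = 0  # relevant documents seen so far
    for rank, rel in enumerate(is_relevant, start=1):
        if rel:
            hits += 1
            # Precision at this rank, counted only at ranks holding a relevant document.
            sum_precision += hits / rank
    # Divide by all relevant documents, including any that were never retrieved.
    return sum_precision / num_relevant


if __name__ == "__main__":
    print(compute_ap([True, False, True], 2))
```

Dividing by `num_relevant` rather than by the number of hits means relevant documents missing from the ranking lower the score, which is why the count is passed in separately.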
def ternary(bool, pos, neg):
    if bool:
        return pos
    else:
        return neg
import numpy as np
import torch
from torch.nn.utils.rnn import pack_padded_sequence, pad_sequence

def pack_lstm(items, lstm):
    N = len(items)
    # Indices that sort the sequences longest-first, as pack_padded_sequence expects by default.
    reorder_args = np.argsort([len(it) for it in items])[::-1]
    # argsort of a permutation is its inverse: presumably used later to restore the input order.
    origin_args = torch.from_numpy(np.argsort(reorder_args))
    ordered = [items[i] for i in reorder_args]
    packed_items = pack_padded_sequence(pad_sequence(ordered, batch_first=True), [len(od) for od in ordered], batch_first=True)
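The preview stops before the LSTM output is put back in the original order. The bookkeeping itself does not need PyTorch, so here is a minimal pure-Python sketch of the sort and unsort step. The helper names `sort_by_length_desc` and `restore_order` are hypothetical and not part of the gist.

```python
def sort_by_length_desc(items):
    """Return items sorted longest-first, plus the inverse permutation."""
    # order[new_pos] = old_pos, the role reorder_args plays above.
    order = sorted(range(len(items)), key=lambda i: len(items[i]), reverse=True)
    # inverse[old_pos] = new_pos, the role origin_args plays above.
    inverse = [0] * len(order)
    for new_pos, old_pos in enumerate(order):
        inverse[old_pos] = new_pos
    ordered = [items[i] for i in order]
    return ordered, inverse


def restore_order(outputs, inverse):
    """Undo the sort: outputs[k] belongs to the k-th sorted item."""
    return [outputs[j] for j in inverse]


if __name__ == "__main__":
    ordered, inverse = sort_by_length_desc(["ab", "abcd", "a"])
    print(ordered, inverse)
    print(restore_order(ordered, inverse))
```

Recent PyTorch versions also accept `enforce_sorted=False` in `pack_padded_sequence`, which makes the manual sort unnecessary when that option is available.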
import agent_ql as aq

# Diaz, F. "Condensed List Relevance Models." (ICTIR 2015)
def CLRM3(query, originalWeight=0.3):
    first_pass = aq.ql(aq.tokenize(query))
    RM = aq.term_probability_model()
    for doc in first_pass.search_now():
        RM += doc.to_term_probabilities() * doc.score
    return first_pass.results().re_rank(first_pass.mixture_model(RM, originalWeight))
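The snippet above reads as pseudocode: it sums score-weighted term distributions from the first-pass documents and mixes the result with the original query. A plain-Python sketch of those two steps might look like the following. Every function name here is hypothetical, the normalization step is an assumption, and document scores are assumed to be non-negative weights (log-scores would need exponentiating first).

```python
from collections import Counter


def term_probabilities(tokens):
    """Maximum-likelihood term distribution of a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}


def relevance_model(scored_docs):
    """Sum each document's term distribution weighted by its score, then normalize."""
    rm = {}
    for tokens, score in scored_docs:
        for term, p in term_probabilities(tokens).items():
            rm[term] = rm.get(term, 0.0) + p * score
    norm = sum(rm.values())
    # Normalizing is an assumption; the snippet above does not show this step.
    return {term: w / norm for term, w in rm.items()} if norm > 0 else rm


def mixture_model(query_tokens, rm, original_weight=0.3):
    """Interpolate the original query model with the relevance model."""
    q = term_probabilities(query_tokens)
    return {
        term: original_weight * q.get(term, 0.0) + (1.0 - original_weight) * rm.get(term, 0.0)
        for term in set(q) | set(rm)
    }


if __name__ == "__main__":
    rm = relevance_model([(["a", "b"], 1.0), (["a", "a"], 1.0)])
    print(mixture_model(["b"], rm, original_weight=0.3))
```

The mixed weights would then be used to re-rank the first-pass results, which is the "condensed list" part: no second retrieval pass over the full collection.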
import java.util.Random;
import java.util.Scanner;

// We discussed academic honesty, so when you re-type this code, be sure to cite it in a comment!
public class GuessingGame {
  /**
   * A Java program will run code in a special ``main`` method.
   * Note that Java has two types of comments: block (slash-star ... star-slash), and line ("slash-slash") comments.
   * For now we ignore args, which is an array of strings that the user might have passed in.
package edu.smith.cs.csc262.coopsh.apps;

import edu.smith.cs.csc262.coopsh.ShellEnvironment;
import edu.smith.cs.csc262.coopsh.Task;

/**
 * This is a full implementation of Echo.
 * @author jfoley
 *
 */
import java.io.*;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class ZipSplit {
use std::convert::TryInto;
use tantivy::{Searcher, Term};

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct CountStats {
    pub collection_frequency: u64,
    pub document_frequency: u64,
    pub collection_length: u64,
    pub document_count: u64,
}