This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# Copyright (c) 2019-present, Thomas Wolf.
# All rights reserved. This source code is licensed under the MIT-style license.
""" A very small and self-contained gist to train a GPT-2 transformer model on wikitext-103 """
# Standard library
import os
from collections import namedtuple

# Third-party: progress bars, PyTorch core, and the Ignite training-loop framework
from tqdm import tqdm
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from ignite.engine import Engine, Events
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
%% Saved with string encoding Unicode (UTF-8)
@inproceedings{Gkotsis2014Content,
title={It's all in the content: state of the art best answer prediction based on discretisation of shallow linguistic features},
author={Gkotsis, George and Stepanyan, Karen and Pedrinaci, Carlos and Domingue, John and Liakata, Maria},
booktitle={Proceedings of the 2014 ACM Conference on Web Science (WebSci)},
pages={202--210},
year={2014},
organization={ACM}
}
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# Brian Abelson @brianabelson
# Harmony Institute
# December 5, 2012
# lda is a wrapper for lda.collapsed.gibbs.sampler in the "lda" package
# it fits topic models using latent dirichlet allocation
# it provides arguments for cleaning the input text and tuning the parameters of the model
# it also returns a lot of useful information about the topics/documents in a format that you can easily join back to your original data
# this allows you to easily model outcomes based on the distribution of topics within a collection of texts