Victor Jotham Ashioya ashioyajotham

@ashioyajotham
ashioyajotham / LLM.md
Created October 3, 2023 21:37 — forked from rain-1/LLM.md
LLM Introduction: Learn Language Models

Purpose

Bootstrap knowledge of LLMs ASAP, with a bias toward GPT.

Avoid being a link dump. Try to provide only valuable, well-tuned information.

Prelude

Neural network links before starting with transformers.

ashioyajotham / normcore-llm.md
Created August 27, 2023 17:54 — forked from veekaybee/normcore-llm.md
Normcore LLM Reads
ashioyajotham / text_preprocessing.py
Created May 18, 2023 16:22 — forked from MrEliptik/text_preprocessing.py
A Python script to preprocess text (remove URLs, lowercase, tokenize, etc.)
import re, string, unicodedata
import nltk
import contractions
import inflect
from nltk import word_tokenize, sent_tokenize
from nltk.corpus import stopwords
from nltk.stem import LancasterStemmer, WordNetLemmatizer
def replace_contractions(text):
    """Replace contractions in string of text"""
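
The preview above is cut off, so the gist's actual pipeline is not visible here. As a rough illustration of the steps its description names (remove URLs, lowercase, tokenize), here is a minimal sketch that uses only the standard library. The `preprocess` function and `URL_PATTERN` are hypothetical names, not taken from the gist, and whitespace splitting stands in for the NLTK tokenizer the gist imports.

```python
import re
import string

# Hypothetical helper, not part of the original gist: a stdlib-only
# approximation of the steps named in the gist description.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")


def preprocess(text):
    """Remove URLs, lowercase, strip ASCII punctuation, split on whitespace."""
    text = URL_PATTERN.sub(" ", text)  # drop anything that looks like a URL
    text = text.lower()
    # Delete ASCII punctuation; note this turns "don't" into "dont",
    # which is why the gist expands contractions as a separate step.
    text = text.translate(str.maketrans("", "", string.punctuation))
    return text.split()  # naive whitespace tokenization


tokens = preprocess("Visit https://example.com NOW, please!")
```

A real pipeline would likely swap the final `split()` for `nltk.word_tokenize` and add stopword removal or stemming, as the gist's imports suggest.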
ashioyajotham / app.py
Created April 6, 2023 22:35 — forked from init27/app.py
ArXiv Chat: Chat with the latest Arxiv papers
# Credit 🙏: I just used the example from langchain docs and it works quite well: https://python.langchain.com/en/latest/use_cases/question_answering.html
# Note 2: The Arxiv -> PDF logic is a bit messy, I'm sure it can be done better
# Note 3: Please install the following:
# To run:
# Save this as `app.py`
# pip install arxiv PyPDF2 langchain chromadb
# The chat feature was shipped in H2O nightly this week, we will need to install from nightly link:
ashioyajotham / scrapper.py
Created December 5, 2022 09:50 — forked from iwouldnot/scrapper.py
Scrape lyrics from genius.com
import asyncio
import json
from time import time, sleep
import aiohttp
import pandas as pd
from bs4 import BeautifulSoup
from tqdm import tqdm
# Actually this is SECRET!!!!! But the code lives in a private repository, so whatever
ashioyajotham / tweet_listener.py
Created March 18, 2022 11:46 — forked from hugobowne/tweet_listener.py
NOTE: this code is for a previous version of the Twitter API and I will not be updating in the near future. If someone else would like to, I'd welcome that! Feel free to ping me. END NOTE. Here I define a Tweet listener that creates a file called 'tweets.txt', collects streaming tweets as .jsons and writes them to the file 'tweets.txt'; once 100…
class MyStreamListener(tweepy.StreamListener):
    def __init__(self, api=None):
        super(MyStreamListener, self).__init__()
        self.num_tweets = 0
        self.file = open("tweets.txt", "w")

    def on_status(self, status):
        tweet = status._json
        self.file.write( json.dumps(tweet) + '\n' )
        self.num_tweets += 1
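
The listener preview stops before its stopping condition, and it targets a retired version of the Twitter API. The underlying pattern (write each incoming record as one JSON line, count the records, stop at a limit) can be sketched without any Twitter dependency. `JsonLineCollector`, `on_record`, and the `limit` parameter are hypothetical names for illustration; the limit value is an assumption, not the original's.

```python
import io
import json


class JsonLineCollector:
    """Hypothetical stand-in for the stream listener; no Twitter API involved."""

    def __init__(self, sink, limit):
        self.sink = sink    # any writable text file object
        self.limit = limit  # stop after this many records (assumed parameter)
        self.count = 0

    def on_record(self, record):
        """Write one record as a JSON line; return False once the limit is hit."""
        self.sink.write(json.dumps(record) + "\n")
        self.count += 1
        return self.count < self.limit


# Usage: feed three records into an in-memory buffer with a limit of two.
buffer = io.StringIO()
collector = JsonLineCollector(buffer, limit=2)
for item in [{"id": 1}, {"id": 2}, {"id": 3}]:
    if not collector.on_record(item):
        break
lines = buffer.getvalue().splitlines()
```

Returning a boolean from the callback mirrors the common convention of signalling the stream to disconnect. Passing the sink in, instead of opening a file in the constructor, keeps the class easy to test and leaves closing the file to the caller.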
ashioyajotham / cheatsheet.md
Created February 12, 2022 22:08 — forked from LKS90/cheatsheet.md
Cheatsheet for LaTeX, using Markdown for markup. I use this with atom.io and markdown-preview-plus to write math stuff

Description

Cheatsheet for LaTeX, using Markdown for markup. I use this with atom.io and 📦markdown-preview-plus to write math stuff. 📦keyboard-localization is necessary when using an international layout (such as [Swiss] German).

Further Reference and source: ftp://ftp.ams.org/pub/tex/doc/amsmath/short-math-guide.pdf

Example expressions / functions