💭
Data! Data! Data!

Victor Jotham Ashioya ashioyajotham

@veekaybee
veekaybee / normcore-llm.md
Last active May 6, 2025 20:15
Normcore LLM Reads

Anti-hype LLM reading list

Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts of models in prod eagerly sought.

Foundational Concepts

Pre-Transformer Models

@init27
init27 / app.py
Last active February 13, 2024 14:39
ArXiv Chat: Chat with the latest Arxiv papers
# Credit 🙏: I just used the example from langchain docs and it works quite well: https://python.langchain.com/en/latest/use_cases/question_answering.html
# Note 2: The Arxiv -> PDF logic is a bit messy, I'm sure it can be done better
# Note 3: Please install the following:
# To run:
# Save this as `app.py`
# pip install arxiv PyPDF2 langchain chromadb
# The chat feature was shipped in H2O nightly this week, we will need to install from nightly link:
@rain-1
rain-1 / LLM.md
Last active April 8, 2025 13:49
LLM Introduction: Learn Language Models

Purpose

Bootstrap knowledge of LLMs ASAP, with a bias toward GPT.

Avoid being a link dump. Try to provide only valuable, well-tuned information.

Prelude

Neural network links before starting with transformers.

@Monsieur-Chat
Monsieur-Chat / DNS_in_detail_WU.md
Last active March 5, 2023 14:06
TryHackMe | DNS in Detail | Walkthrough

TryHackMe | DNS in Detail | Writeup

TryHackMe - DNS in Detail - Writeup

DNS in Detail is a room created by adamtlangley.

Answers/Flags :

What is DNS?

| Question | Answer |

'''
Tic-Tac-Toe Console Application
@author 'zenius lama'
@Version 3.5.3
@Since 2020-06-06
'''
from random import randrange
#
@MrEliptik
MrEliptik / text_preprocessing.py
Created January 14, 2019 12:01
A Python script to preprocess text (remove URLs, lowercase, tokenize, etc.)
import re, string, unicodedata
import nltk
import contractions
import inflect
from nltk import word_tokenize, sent_tokenize
from nltk.corpus import stopwords
from nltk.stem import LancasterStemmer, WordNetLemmatizer
def replace_contractions(text):
    """Replace contractions in string of text"""
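
The steps named in the description of this script (remove URLs, lowercase, tokenize) can be sketched with the standard library alone. This is a minimal illustration, not the gist's code: the function name `preprocess` and both regular expressions are assumptions made for the example, and the original script relies on nltk, contractions, and inflect instead.

```python
import re

# Assumed patterns for this sketch; the original gist's rules may differ.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
TOKEN_RE = re.compile(r"[a-z0-9]+(?:'[a-z]+)?")


def preprocess(text):
    """Remove URLs, lowercase, and split into word tokens (hypothetical helper)."""
    text = URL_RE.sub(" ", text)   # drop anything that looks like a URL
    text = text.lower()            # normalize case
    return TOKEN_RE.findall(text)  # keep alphanumeric runs, allowing one apostrophe suffix


if __name__ == "__main__":
    print(preprocess("Check https://example.com NOW, it's great!"))
```

A regex tokenizer like this is cruder than `nltk.word_tokenize`, but it needs no downloaded corpora, which makes it handy for quick experiments.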
@iwouldnot
iwouldnot / scrapper.py
Last active December 5, 2022 09:50
Scrape lyrics from genius.com
import asyncio
import json
from time import time, sleep
import aiohttp
import pandas as pd
from bs4 import BeautifulSoup
from tqdm import tqdm
# Technically SECRET!!!!! But the code lives in a private repository, so whatever
@hugobowne
hugobowne / tweet_listener.py
Last active October 6, 2023 18:48
NOTE: this code is for a previous version of the Twitter API and I will not be updating in the near future. If someone else would like to, I'd welcome that! Feel free to ping me. END NOTE. Here I define a Tweet listener that creates a file called 'tweets.txt', collects streaming tweets as .jsons and writes them to the file 'tweets.txt'; once 100…
class MyStreamListener(tweepy.StreamListener):
    def __init__(self, api=None):
        super(MyStreamListener, self).__init__()
        self.num_tweets = 0
        self.file = open("tweets.txt", "w")

    def on_status(self, status):
        tweet = status._json
        self.file.write( json.dumps(tweet) + '\n' )
        self.num_tweets += 1
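
The pattern the listener follows (serialize each incoming item as one JSON object per line and keep a running count) can be tried without tweepy or Twitter credentials. The sketch below is an assumption-laden stand-in, not the gist's code: `JsonLinesCollector`, `on_item`, and `close` are hypothetical names, and since the description is truncated, no stopping rule is modeled.

```python
import json


class JsonLinesCollector:
    """Write each received item to a file as one JSON object per line (hypothetical class)."""

    def __init__(self, path):
        self.num_items = 0
        self.file = open(path, "w", encoding="utf-8")

    def on_item(self, item):
        # Same shape as on_status in the preview: serialize, write, count.
        self.file.write(json.dumps(item) + "\n")
        self.num_items += 1

    def close(self):
        # Close explicitly so buffered lines are flushed to disk.
        self.file.close()
```

Reading the file back is then a matter of calling `json.loads` on each line.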