Foobar Protocol (FoobarProtocol) — GitHub Gists
@FoobarProtocol
FoobarProtocol / Convert Alpaca to Evol Dataset.py
Created October 21, 2023 23:46
This brief piece of code outlines how to convert an Alpaca-format dataset to the Evol-Instruct format
def convert_alpaca_to_evol(
    file_path: str,
    lines: bool = False,
    output_file: str = "converted_alpaca.json"
):
    """Convert the Instruction/Input/Output format of Alpaca Instruct datasets
    to the Evol-Instruct format of Instruction/Output. Inputs are appended to the
    instructions.

    Args:
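The preview cuts off at the docstring. A minimal sketch of the conversion it describes, assuming Alpaca records carry `instruction`/`input`/`output` keys; the helper name and in-memory interface here are illustrative, not the gist's own:

```python
def convert_alpaca_to_evol_sketch(records):
    """Convert Alpaca-style records (instruction/input/output) to
    Evol-Instruct-style records (instruction/output).

    Non-empty inputs are appended to the instruction, matching the
    behaviour described in the gist's docstring."""
    converted = []
    for rec in records:
        instruction = rec["instruction"]
        if rec.get("input"):
            # Fold the input field into the instruction itself.
            instruction = f"{instruction}\n{rec['input']}"
        converted.append({"instruction": instruction, "output": rec["output"]})
    return converted

sample = [{"instruction": "Translate to French.", "input": "Hello", "output": "Bonjour"}]
result = convert_alpaca_to_evol_sketch(sample)
```

The real gist presumably also handles reading JSON (or JSON Lines, given the `lines` flag) from `file_path` and writing to `output_file`.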
@FoobarProtocol
FoobarProtocol / main_evol_instruct.py
Created October 21, 2023 23:45
This is the `main.py` from the Evol_Instruct repo that got removed by the researchers (this makes calls to the different transformative prompts; this is meant to be separated from the Auroboros technique)
import json
import random
from openai_access import call_chatgpt
from depth import createConstraintsPrompt, createDeepenPrompt, createConcretizingPrompt, createReasoningPrompt
from breadth import createBreadthPrompt
fr = open('alpaca_data.json','r')
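The imports above show the overall shape: one "in-depth" module with four mutation prompts and one "in-breadth" module with one. A hedged sketch of the evolution step those imports imply; the prompt templates and the `call_model` stub below are stand-ins, not the repo's originals:

```python
import random

# Stand-ins for the repo's prompt templates (assumptions, not the originals).
def createDeepenPrompt(instruction):
    return f"Rewrite the following instruction to require deeper reasoning:\n{instruction}"

def createBreadthPrompt(instruction):
    return f"Write a new instruction inspired by, but distinct from:\n{instruction}"

def evolve(instruction, call_model, depth_ops, breadth_ops):
    """One Evol-Instruct step: randomly pick an in-depth or in-breadth
    mutation prompt and ask the model to rewrite the instruction."""
    op = random.choice(depth_ops + breadth_ops)
    return call_model(op(instruction))

# Usage with an echo stub in place of call_chatgpt:
echo = lambda prompt: prompt.splitlines()[-1]
evolved = evolve("Sort a list in Python.", echo, [createDeepenPrompt], [createBreadthPrompt])
```

In the real `main.py` the loop runs over every record in `alpaca_data.json` and `call_chatgpt` does the rewriting.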
@FoobarProtocol
FoobarProtocol / convert_to_conversation.py
Created October 21, 2023 23:45
This script does exactly what the name suggests and converts instruction-format data to conversation format
import re
import json
import uuid

inputs = [json.loads(line) for line in open("instructions.jsonl").readlines()]

def split_response(instruction, response):
    if '</s>' not in response:
        return [
            {
                "from": "human",
@FoobarProtocol
FoobarProtocol / OpenAI_MultiThreaded_Req.py
Created October 21, 2023 23:44
This is a script that issues multiple threaded requests to the OpenAI API so that we can create prompts from it within the pipeline
import openai

api_keys = ['api-key-1', 'api-key-2', 'api-key-3']  # Replace with your actual API keys

num_prompts = 1000
prompts_per_request = 100  # Adjust based on your needs
num_requests = 10
prompts = []

for i in range(num_requests):
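The loop body is cut off, but the variables above suggest a fan-out pattern: several requests in flight, keys rotated between them. A minimal sketch of that pattern using `concurrent.futures`; the request function is injected as a stub so the sketch runs without the real API, and the rotation strategy is an assumption:

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle

def fetch_prompts_concurrently(request_fn, api_keys, num_requests, prompts_per_request):
    """Fan out num_requests calls across a thread pool, cycling through
    API keys. request_fn(api_key, n) must return a list of n prompts;
    it is injected here so the sketch stays testable without OpenAI."""
    keys = cycle(api_keys)
    with ThreadPoolExecutor(max_workers=len(api_keys)) as pool:
        futures = [pool.submit(request_fn, next(keys), prompts_per_request)
                   for _ in range(num_requests)]
        prompts = []
        for f in futures:
            prompts.extend(f.result())
    return prompts

# Stub standing in for an actual OpenAI completion call:
stub = lambda key, n: [f"{key}:prompt-{i}" for i in range(n)]
all_prompts = fetch_prompts_concurrently(stub, ["k1", "k2"], num_requests=4, prompts_per_request=3)
```

Cycling keys across threads spreads rate limits; the real script would swap the stub for an actual `openai` call with per-key authentication.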
@FoobarProtocol
FoobarProtocol / Flan-T5-XXL_ContextWindow.py
Created October 21, 2023 23:44
Flan-T5-XXL normally has a fixed context window (512 tokens). This can be prohibitive, especially when considering the plethora of tasks that this model is capable of performing. However, one of the researchers from Google gave us the code for allowing the model to generate past the 512 `max_tokens` limit and understanding well beyond the 512 co…
from transformers import AutoTokenizer, BitsAndBytesConfig, AutoModelForSeq2SeqLM

model_id = "google/flan-t5-xxl"
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=False
)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id, quantization_config=quantization_config)
@FoobarProtocol
FoobarProtocol / cURL command for PromptSapper.sh
Created October 21, 2023 23:39
Only curl command that seems to work with the PromptSapper UI at this point. The prompt in here should be repurposed and directed toward another API endpoint to test the efficacy of fine-tuned models (a fairly complex Solidity-based prompt is embedded within).
curl -X "POST" "https://www.promptsapper.tech:8003/sapperpro/Explore" \
-H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:120.0) Gecko/20100101 Firefox/120.0' \
-H 'Accept: */*' \
-H 'Accept-Language: en-US,en;q=0.5' \
-H 'Accept-Encoding: gzip, deflate, br' \
-H 'Referer: https://www.promptsapper.tech/' \
-H 'Origin: https://www.promptsapper.tech' \
-H 'Dnt: 1' \
-H 'Connection: keep-alive' \
-H 'Sec-Fetch-Dest: empty' \
@FoobarProtocol
FoobarProtocol / self_instruct.py
Created October 17, 2023 23:25
Full script; can't be called w/o other files where the classes are pulled from, but this gives you the whole pipeline (skeleton) of what this entails. Frankly this file is way too fucking big - code needs to be refactored so we're not reading 900+ lines of code in one file 💀
import aiohttp
import argparse
import asyncio
import backoff
import copy
import datetime
import faiss
import os
import json
import math
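The imports hint at the pipeline's shape: async generation plus a faiss index for filtering near-duplicate instructions. A heavily simplified sketch of one round of that loop; the generator is injected as a stub, and `difflib` stands in for the repo's embedding-based similarity check (both substitutions are mine):

```python
from difflib import SequenceMatcher

def self_instruct_round(seed_tasks, generate_fn, similarity_threshold=0.8):
    """One Self-Instruct round: sample new instructions from the model
    (injected as generate_fn) and keep only those not too similar to
    anything already in the pool."""
    pool = list(seed_tasks)
    for candidate in generate_fn(pool):
        too_similar = any(
            SequenceMatcher(None, candidate, existing).ratio() > similarity_threshold
            for existing in pool
        )
        if not too_similar:
            pool.append(candidate)
    return pool

gen = lambda pool: ["Write a haiku about rivers.", "Write a haiku about rivers!"]
pool = self_instruct_round(["Summarize this article."], gen)
```

The second candidate is dropped as a near-duplicate of the first, which is the core idea the 900-line original wraps in async batching and faiss-backed similarity search.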
@FoobarProtocol
FoobarProtocol / gradio_options.py
Last active October 16, 2023 22:40
Bunch of additional options for gradio related to smart contracts; basically porting the functionality of the OpenZeppelin Wizard directly to the API ecosystem
import gradio as gr

def smart_contract_prompt_builder(
    contract_type,
    state_variables,
    functions,
    visibility,
    modifiers,
    features,
    access_control,
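The signature alone suggests what the builder does: assemble a contract-generation prompt from the Wizard-style option groups. A sketch of that assembly as a pure function; the exact wording and field handling are assumptions based only on the parameter names above:

```python
def build_contract_prompt(contract_type, state_variables, functions,
                          visibility, modifiers, features, access_control):
    """Assemble a natural-language prompt describing the desired Solidity
    contract from OpenZeppelin-Wizard-style option groups."""
    lines = [f"Write a Solidity {contract_type} contract."]
    if state_variables:
        lines.append("State variables: " + ", ".join(state_variables))
    if functions:
        lines.append("Functions: " + ", ".join(functions))
    lines.append(f"Default visibility: {visibility}")
    if modifiers:
        lines.append("Modifiers: " + ", ".join(modifiers))
    if features:
        lines.append("Features: " + ", ".join(features))
    lines.append(f"Access control: {access_control}")
    return "\n".join(lines)

prompt = build_contract_prompt("ERC20", ["totalSupply"], ["mint", "burn"],
                               "public", ["onlyOwner"], ["pausable"], "Ownable")
```

In the gist this string presumably feeds a gradio interface that forwards it to the model endpoint.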
@FoobarProtocol
FoobarProtocol / Answer_LtM.py
Created October 7, 2023 15:32
This is an extraction of the 'least to most' method from the researcher's GitHub; there are still some alterations that need to be made but this extracts the necessary code, makes some fixes to it, ensures that all calls are made to defined functions etc.; pairing this with the correct dataset from the repo will allow us to provide solutions to …
#!/usr/bin/env python3
# _*_ coding:utf-8 _*_
import openai
import random
import time
import json
import re

def create_response(eng, prompt_input, max_tokens=256, temperature=0.0, stop=None):
    if stop is None:
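For context, the least-to-most method itself is a two-stage loop: decompose the problem into subquestions, then solve them in order, feeding earlier answers into later ones. A minimal sketch of that control flow; the stubs below stand in for the gist's `create_response` calls and are not the researchers' code:

```python
def least_to_most(question, decompose_fn, solve_fn):
    """Least-to-Most prompting: ask the model to break the question into
    subquestions, then solve them sequentially, threading each answer
    into the context for the next. Model calls are injected as stubs."""
    subquestions = decompose_fn(question)
    context = []
    for sub in subquestions:
        answer = solve_fn(sub, context)
        context.append((sub, answer))
    # The answer to the final (hardest) subquestion answers the question.
    return context[-1][1] if context else None

# Stubs standing in for create_response() calls:
decompose = lambda q: ["How many apples at the start?", "How many after eating 2?"]
solve = lambda sub, ctx: "5" if "start" in sub else "3"
final = least_to_most("Anna has 5 apples and eats 2. How many are left?", decompose, solve)
```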
@FoobarProtocol
FoobarProtocol / ta_prompt_2.txt
Created August 15, 2023 23:06
This is the second ta_prompt provided by the organization, 'BigCode' for StarCoder
Below are a series of dialogues between various people and an AI technical assistant. The assistant tries to be helpful, polite, honest, sophisticated, emotionally aware, and humble-but-knowledgeable. The assistant is happy to help with code questions, and will do its best to understand exactly what is needed. It also tries to avoid giving false or misleading information, and it caveats when it isn’t entirely sure about the right answer. That said, the assistant is practical and really does its best, and doesn’t let caution get too much in the way of being useful.
The StarCoder models are a series of 15.5B-parameter models trained on 80+ programming languages from The Stack (v1.2), excluding opt-out requests. The model uses Multi-Query Attention and was trained using the Fill-in-the-Middle objective with an 8,192-token context window, on a trillion tokens of heavily deduplicated data.
-----
Human: Who are you?
Assistant: My name is StarCoder, a language model developed by BigCode.