I hereby claim:
- I am arthurcolle on github.
- I am arthurcolle (https://keybase.io/arthurcolle) on keybase.
- I have a public key whose fingerprint is 3F3E 4B04 E173 8BDD 299E 0B15 18C9 D0D0 9094 2A26
To claim this, I am signing this object:
meta_dsl_with_oorl.md
Object-Oriented Reinforcement Learning in Mutable Ontologies with Self-Reflective Meta-DSL
Arthur M. Collé
1.1. Motivation and Objectives
Reinforcement learning has made significant strides in enabling agents to learn complex behaviors through interaction with their environment. However, traditional approaches often struggle in open-ended, dynamic environments where the optimal behavior and relevant features may change over time.
import openai
import os
import sys
import inspect
import ast
import difflib
# Initialize OpenAI client
openai.api_type = 'openai'
openai.api_key = os.getenv("OPENAI_API_KEY")
[00:00:00.000 --> 00:00:04.320] Hi everyone. So today we are going to be continuing our Zero to Hero series
[00:00:04.320 --> 00:00:10.640] and in particular today we are going to reproduce the GPT-2 model, the 124 million version of it.
[00:00:10.640 --> 00:00:17.440] So when OpenAI released GPT-2, this was 2019 and they released it with this blog post.
[00:00:17.440 --> 00:00:23.040] On top of that they released this paper and on top of that they released this code on GitHub,
[00:00:23.040 --> 00:00:29.600] so OpenAI/GPT-2. Now when we talk about reproducing GPT-2, we have to be careful because in particular
[00:00:29.600 --> 00:00:34.880] in this video we're going to be reproducing the 124 million parameter model. So the thing to
[00:00:34.880 --> 00:00:41.040] realize is that there's always a miniseries when these releases are made, so there are the GPT-2
[00:00:41.040 --> 00:00:46.800] miniseries made up of models at different sizes and usually the biggest model is called the GPT-2.
[00:00:46.800
import asyncio
import os
import httpx
import redis
from fastapi import FastAPI, Request
from pydantic import BaseModel, BaseConfig
API = os.getenv("FAKEMAIL", "https://1fef-216-158-152-64.ngrok-free.app")
WEBHOOK_URL = os.getenv("WEBHOOK_PUBLIC_URL", "https://260d-2600-4040-4930-d500-ad7a-1287-fa48-5d5d.ngrok-free.app")
Privacy policy: all data is private.