Brando Miranda brando90

{
    "source": "...",
    "id": "...",
    "attributes": {
      "compression_ratio_zstd": 0.7
    }
}

Simplified version of Dually Ground BackTranslation for AutoFormalization:

def train_to_af_for_maf(mdl : causal_lm,
                        formal_data_set, # e.g., ITP lib like mathlib
                        informal_data_set,  # e.g., time-tested maths textbook e.g., Rudin, CLRS.
                        ):
    for (nl, fl*) in formal_data_set; for (nl*, fl) in informal_data_set;
        # -- Learn to Formalize: nl_i->fl* from fl* -> [nl_i]_i -> fl*
        [nl_i]_i := mdl("informalize " + fl*, sampling=top_p, num_out=k)  # noise is good for robustness!

---- Prompts for Assessing & Rubber Ducking discussions on your research plan according to Vectoring ----

references: - https://web.stanford.edu/class/cs197/slides/04-vectoring.pdf

We should have three prompts. One tackling a different way to mitigate risks and uncertainties in projects with uncertainties -- especially in computer science and machine learning research. They are:

Prompt 1: attempting to guess/extrapolate/identifying unknown unknowns -- to reduce risks we are unaware of.
Prompt 2: asses my vectoring plan with assumptions for effective tackling of the unknown i.e., research

	# # 'torch==2.1.2', # 2.2 not supported due to vllm see: https://github.com/vllm-project/vllm/issues/2747
	...
	# 'transformers>=4.40',
	# 'accelerate==0.29.2',
	...
	# 'datasets==2.14.7',
	# 'evaluate==0.4.1',
	# 'bitsandbytes== 0.43.0',
	# 'einops',
	# 'flash-attn>=2.5.8',

	from datasets import load_dataset
	from transformers import GPT2Tokenizer, GPT2LMHeadModel, Trainer, TrainingArguments, DataCollatorForLanguageModeling

	# Load dataset from a JSON file
	data_files = {"train": "path/to/your/train.json", "test": "path/to/your/test.json"}
	dataset = load_dataset("json", data_files=data_files)

	# Load pre-trained GPT-2 tokenizer and model
	tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
	model = GPT2LMHeadModel.from_pretrained("gpt2")

	# -- HELM prompt, 8 shot, CoT? ref: https://storage.googleapis.com/crfm-helm-public/lite/benchmark_output/runs/v1.0.0/math:subject=algebra,level=1,use_official_examples=False,use_chain_of_thought=True,model=01-ai_yi-34b/scenario_state.json, https://crfm.stanford.edu/helm/lite/latest/#/runs/math:subject=algebra,level=1,use_official_examples=False,use_chain_of_thought=True,model=01-ai_yi-34b
	HELM_MATH_PROMPT: str = (
	"""Given a mathematics problem, determine the answer. Simplify your answer as much as possible.###
	Problem: Let $r=3^s-s$ and $s=2^n+1$. What is the value of $r$ when $n=2$?
	Answer: First substitute $n=2$ into the expression for $s$ to find $s=2^2+1=5$. Then substitute $s=5$ into the expression for $r$ to find $r=3^5-5=243-5=\\boxed{238}.###
	Problem: If $x^{2y}= 4$ and $x = 4$, what is the value of $y$? Express your answer as a common fraction.
	Answer: Plugging $x = 4$ into the first equation, we get $4^{2y} = 4^1 \\Rightarrow 2y = 1 \\Rightarrow y = \\boxed{\\frac{1}{2}}.###
	Problem: If $y = \\dis

	# -- Prompt minerva MATH 3 - better minerva + cot/scratch_pad
	# https://github.com/EleutherAI/lm-evaluation-harness/blob/main/lm_eval/tasks/minerva_math/utils.py#L22
	H_MATH_MINERVA_PROMPT_TEMPLATE_3_COT = (
	r"""Problem:
	Find the domain of the expression $\frac{\sqrt{x-2}}{\sqrt{5-x}}$.}

	Solution:
	Let's think step by step. The expressions inside each square root must be non-negative. Therefore, $x-2 \ge 0$, so $x\ge2$, and $5 - x \ge 0$, so $x \le 5$. Also, the denominator cannot be equal to zero, so $5-x>0$, which gives $x<5$. Therefore, the domain of the expression is $\boxed{[2,5)}$.

	Problem:

	import math
	from pantograph.server import Server, ServerError
	from pantograph.expr import GoalState, TacticHave, TacticCalc, Tactic
	from loguru import logger
	from dataclasses import dataclass, field
	from typing import Optional, List, Tuple

	@dataclass(frozen=True)
	class SearchResult:
	"""The result of attempting to prove a theorem."""

	@inproceedings{langley00,
	author = {P. Langley},
	title = {Crafting Papers on Machine Learning},
	year = {2000},
	pages = {1207--1216},
	editor = {Pat Langley},
	booktitle = {Proceedings of the 17th International Conference
	on Machine Learning (ICML 2000)},
	address = {Stanford, CA},
	publisher = {Morgan Kaufmann}

	/-
	theorem: lim_{x -> c+} f(x) = +infinity

	x + infinit = +infinity

	lim_{x -> c} f(x) = L
	∀ ε > 0, ∃ δ > 0, 0 < \|x - c\| < δ → 0 < \|f(x) - L\| < ε

	L = + infinity
	consider some ε > 0