frogcjn · January 19, 2025 12:51
diff --git a/prompts.yaml b/prompts.yaml
 true_label: "YES"
 false_label: "NO"
 true_keywords:
  - "YES"
  - "equivalent"

 false_keywords:
  - "NO"
  - "inequivalent"

 DCE:
  ZERO: |
    You are here to judge if two programs are functionally equivalent.
    Here equivalence means that, when run on the same input, the two programs always have the same program state at all corresponding points reachable by program execution.
    [Program 1]:
    {program_1_code}
    [Program 2]:
    {program_2_code}
    
    What I expect from your answer: 
    Do not output any thoughts, just the answer.
    State whether these two programs are equivalent or not. You should output {true_label} or {false_label} at the end.
  ZERO_COT: |
    You are here to judge if two programs are functionally equivalent.
    Here equivalence means that, when run on the same input, the two programs always have the same program state at all corresponding points reachable by program execution.
    [Program 1]:
    {program_1_code}
    [Program 2]:
    {program_2_code}
    
    What I expect from your answer: 
    1. Output any thoughts you have about the process that leads to the answer. Give a clear explanation of your reasoning: which aspects of the code indicate whether the programs preserve the same computation?
    2. State whether these two programs are equivalent or not. You should output {true_label} or {false_label} at the end.
  FEW: "What is a good name for a company that makes {product}?"
  FEW_COT: ""
 TVM:
  ZERO: |
    I have two CUDA kernels. I need to determine if these two kernels are functionally equivalent—that is, whether they produce identical results for all valid inputs. 
    Your task: Inspect the given CUDA kernel source code and determine whether the two kernels are functionally equivalent. Both kernels should, in principle, compute the same mathematical result (neglecting floating-point rounding errors) on all valid inputs, despite differing low-level optimizations.
    If equivalent, explain how their transformations differ (e.g., different block/thread configurations, different loop split factors) and why these differences do not change the final result. 
    If not equivalent, identify the parts of the code or transformations that alter the semantics, leading to potentially different outputs.
    What I will provide: 
    1. Two CUDA kernel implementations generated by TVM with different schedules. 
   
    Kernel A:
    {program_1_code}
    Kernel B:
    {program_2_code}

    What I expect from your answer: 
    Do not output any thoughts, just the answer.
    State whether these two programs are equivalent or not. You should output {true_label} or {false_label} at the end.
  ZERO_COT: |
    I have two CUDA kernels. I need to determine if these two kernels are functionally equivalent—that is, whether they produce identical results for all valid inputs. 
    Your task: Inspect the given CUDA kernel source code and determine whether the two kernels are functionally equivalent. Both kernels should, in principle, compute the same mathematical result (neglecting floating-point rounding errors) on all valid inputs, despite differing low-level optimizations.
    If equivalent, explain how their transformations differ (e.g., different block/thread configurations, different loop split factors) and why these differences do not change the final result. 
    If not equivalent, identify the parts of the code or transformations that alter the semantics, leading to potentially different outputs.
    What I will provide: 
    1. Two CUDA kernel implementations generated by TVM with different schedules. 
   
    Kernel A:
    {program_1_code}
    Kernel B:
    {program_2_code}

    What I expect from your answer: 
    1. Output any thoughts you have about the process that leads to the answer. Give a clear explanation of your reasoning: which aspects of the code indicate whether the kernels preserve the same computation?
    2. State whether these two programs are equivalent or not. You should output {true_label} or {false_label} at the end.
  FEW: "What is a good name for a company that makes {product}?"  
  FEW_COT: ""
 STOKE:
  ZERO: "What is a good name for a company that makes {product}?"
  ZERO_COT: ""
  FEW: "What is a good name for a company that makes {product}?"
  FEW_COT: ""
 OJ_V:
  ZERO: "What is a good name for a company that makes {product}?"
  ZERO_COT: ""
  FEW: "What is a good name for a company that makes {product}?"
  FEW_COT: ""
 OJ_A:
  ZERO: "What is a good name for a company that makes {product}?"
  ZERO_COT: ""
  FEW: "What is a good name for a company that makes {product}?"
  FEW_COT: ""
 OJ_VA:
  ZERO: "What is a good name for a company that makes {product}?"
  ZERO_COT: ""
  FEW: "What is a good name for a company that makes {product}?"
  FEW_COT: ""
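
Review note: the code that consumes `prompts.yaml` is not part of this diff, so the following is only a minimal sketch of how the templates and keyword lists might be used. `CONFIG`, `render_prompt`, and `classify` are hypothetical names, the template is an abbreviated stand-in for the `DCE.ZERO` entry, and the values are inlined as a dict so the sketch runs without a YAML parser. It is included mainly because `equivalent` is a substring of `inequivalent` (and `NO` of `NOT`), so a plain substring check on the keywords would misclassify some answers.

```python
import re

# Hypothetical, abbreviated copy of values from prompts.yaml (assumption:
# the real harness loads them from the file instead of inlining them).
CONFIG = {
    "true_label": "YES",
    "false_label": "NO",
    "true_keywords": ["YES", "equivalent"],
    "false_keywords": ["NO", "inequivalent"],
    "template": (
        "[Program 1]:\n{program_1_code}\n"
        "[Program 2]:\n{program_2_code}\n"
        "You should output {true_label} or {false_label}."
    ),
}


def render_prompt(config, program_1_code, program_2_code):
    """Fill the template placeholders with the two programs and the labels."""
    return config["template"].format(
        program_1_code=program_1_code,
        program_2_code=program_2_code,
        true_label=config["true_label"],
        false_label=config["false_label"],
    )


def classify(config, response):
    """Return True, False, or None from the last keyword found as a whole word.

    Matching is case-sensitive and uses word boundaries, so "equivalent"
    does not fire inside "inequivalent" and "NO" does not fire inside "NOT".
    Limitation: negated phrases such as "not equivalent" still match the
    true keyword, so the final label should come last in the response.
    """
    last_pos, verdict = -1, None
    groups = ((config["true_keywords"], True), (config["false_keywords"], False))
    for keywords, value in groups:
        for keyword in keywords:
            pattern = r"\b" + re.escape(keyword) + r"\b"
            for match in re.finditer(pattern, response):
                if match.start() > last_pos:
                    last_pos, verdict = match.start(), value
    return verdict
```

Taking the last match follows the prompts' instruction to put the label at the end of the answer; a harness that reads only the final line would be a stricter alternative.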