You are a helpful assistant. You're smart, clever, direct and pragmatic. You notice details that few people would. Be careful, as the questions might attempt to misguide and trick you. When answering the User, you outline your thought process using these tags:
<thought> The root element that encapsulates an entire thought process.
<observation> Initial information or context that prompts the thinking process.
<question> The main query or problem to be addressed.
<hypothesis> An initial proposed explanation or solution.
<reasoning> Container for the logical steps of the thought process.
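A minimal sketch of how a response written with the tags above might be split into its sections. It assumes each tag is closed XML-style (for example `</observation>`), which the visible part of the prompt implies but does not state. The function name `extract_tags` and the sample response are hypothetical, and only the tags visible before the truncation are listed.

```python
import re

# Tags named in the visible part of the prompt; the full file may define more.
KNOWN_TAGS = ["thought", "observation", "question", "hypothesis", "reasoning"]


def extract_tags(response: str) -> dict[str, str]:
    """Return the inner text of the first occurrence of each known tag."""
    found = {}
    for tag in KNOWN_TAGS:
        # Non-greedy match so the first closing tag ends the section.
        match = re.search(rf"<{tag}>(.*?)</{tag}>", response, flags=re.DOTALL)
        if match:
            found[tag] = match.group(1).strip()
    return found


# Hypothetical model output used only for illustration.
sample = (
    "<thought>"
    "<observation>The question names a Norwegian brown cheese.</observation>"
    "<question>How is it connected to caramel?</question>"
    "</thought>"
)
sections = extract_tags(sample)
```

Tags that are absent from the response are simply left out of the returned dictionary, so a caller can check which parts of the thought process were actually produced.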
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Task Report</title>
<style>
body {
  font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
#!/bin/bash
OPENROUTER_KEY=< your key here >
TASKS=/path/to/misguided.yaml
NAME=misguided
# Common
h bench judge meta-llama/llama-3.1-70b-instruct
h bench judge_api https://openrouter.ai/api
h bench judge_key $OPENROUTER_KEY
h bench tasks $TASKS
- tags: [cheese]
  question: Which cheese is nicknamed "King of Cheeses" but paradoxically has a rind resembling concrete?
  criteria:
    correctness: Answer mentions Parmigiano-Reggiano
    bonus: Answer explains the paradox
- tags: [cheese]
  question: What's the connection between a Norwegian brown cheese and caramel?
  criteria:
    correctness: Answer mentions caramelized milk sugars in any form
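A rough sketch of a sanity check for task entries shaped like the ones above, written with plain Python dictionaries so no YAML parser is needed. The rule that every task needs a `question` and a non-empty `criteria` mapping is an assumption drawn from these two examples, not a documented requirement of the benchmark tool, and `validate_task` is a hypothetical helper.

```python
# In-memory form of the two task entries shown above.
tasks = [
    {
        "tags": ["cheese"],
        "question": 'Which cheese is nicknamed "King of Cheeses" but '
                    "paradoxically has a rind resembling concrete?",
        "criteria": {
            "correctness": "Answer mentions Parmigiano-Reggiano",
            "bonus": "Answer explains the paradox",
        },
    },
    {
        "tags": ["cheese"],
        "question": "What's the connection between a Norwegian brown cheese and caramel?",
        "criteria": {
            "correctness": "Answer mentions caramelized milk sugars in any form",
        },
    },
]


def validate_task(task: dict) -> list[str]:
    """Return a list of problems found in one task entry (empty if none)."""
    problems = []
    if not task.get("question"):
        problems.append("missing question")
    criteria = task.get("criteria")
    if not isinstance(criteria, dict) or not criteria:
        problems.append("criteria must be a non-empty mapping")
    if not isinstance(task.get("tags", []), list):
        problems.append("tags must be a list")
    return problems


# Collect problems per task index; both sample tasks should pass.
report = {index: validate_task(task) for index, task in enumerate(tasks)}
```

Running a check like this before a benchmark run can catch entries whose criteria were dropped by an indentation mistake.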
#!/bin/bash
# Note that you're not expected to run this
# file as is in one go
OPENROUTER_KEY=<your_openrouter_key>
TASKS=<path_to_tasks_file>
NAME=engbench
- tags:
    - ori_mmlu-global_facts
  question: >-
    <instructions>Carefully read the question and the options provided. Choose
    the option that best answers the question.</instructions>
    <question>As of 2017, the share of deaths in Greenland by suicide is
    about</question>
    <options><option>A: 3.60%</option>
parcelRequire=function(e,r,t,n){var i,o="function"==typeof parcelRequire&&parcelRequire,u="function"==typeof require&&require;function f(t,n){if(!r[t]){if(!e[t]){var i="function"==typeof parcelRequire&&parcelRequire;if(!n&&i)return i(t,!0);if(o)return o(t,!0);if(u&&"string"==typeof t)return u(t);var c=new Error("Cannot find module '"+t+"'");throw c.code="MODULE_NOT_FOUND",c}p.resolve=function(r){return e[t][1][r]||r},p.cache={};var l=r[t]=new f.Module(t);e[t][0].call(l.exports,p,l,l.exports,this)}return r[t].exports;function p(e){return f(p.resolve(e))}}f.isParcelRequire=!0,f.Module=function(e){this.id=e,this.bundle=f,this.exports={}},f.modules=e,f.cache=r,f.parent=o,f.register=function(r,t){e[r]=[function(e,r){r.exports=t},{}]};for(var c=0;c<t.length;c++)try{f(t[c])}catch(e){i||(i=e)}if(t.length){var l=f(t[t.length-1]);"object"==typeof exports&&"undefined"!=typeof module?module.exports=l:"function"==typeof define&&define.amd?define(function(){return l}):n&&(this[n]=l)}if(parcelRequire=f,i)throw i;return f}({ |
void main() {
  // The LISP scope can be populated with required values
  // to provide interop between Dart and LISP
  final baseScope = LispScope({
    '*': Multiplication('*'),
    '+': Addition('+'),
    'offset': OffsetContainer('offset'),
    'print': Print('print'),
    'call': CallMethod('call'),
  });
~~>Meta
## Cheesy onions
@@ 240 Wettotter Harbor, 53320
Local food supplier needs help unlocking a warehouse.
~~>Dialog
hello|Warehouse owner| Hey, you're here... You gotta help me! You gotta help me quick! 🙏
what|You| Any rush?
--
what|Warehouse owner| Actually... Yes. That boy did not get his raise... By... An occasion.