@VictorTaelin
Created February 24, 2025 16:47
prompt to generate _test_.js
the current dir has the following files:
bench_cnots.hvml
bench_count.hs
bench_count.hvml
bench_sum_range.hs
bench_sum_range.hvml
bench_sum_range.js
bug_tmp
docs
enum_bin.hvml
enum_coc_smart.hvml
enum_lam_naive_blc.hs
enum_lam_naive_blc.hvml
enum_lam_smart.hvml
enum_nat.hvml
enum_primes.hs
enum_primes.hvml
feat_affine_ctx.hvml
feat_cmul.hvml
feat_hoas.hvml
feat_mut_ref.hvml
fuse_inc.hvm1
fuse_inc.hvml
fuse_mul.hvml
fuse_rot.hvml
main.hvml
the hvml files can be evaluated with hvml, as follows:
to run on interpreted mode:
$ cabal run -v0 hvml --project-dir=/Users/v/vic/dev/HVM3 -- run file_name.hvml -C -s
to run on compiled mode:
$ hvml run file_name.hvml -c -C -s
the output of running a hvml file is formatted like:
```
λa λb a
WORK: 117440614 interactions
TIME: 1.0978210 seconds
SIZE: 335544641 nodes
PERF: 106.976 MIPS
```
that is, the first line is the result, and then we have some stats.
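as a minimal sketch of that extraction, the snippet below pulls the result and the time out of the sample output shown above. `parseHvmlOutput` is a hypothetical helper name used only for illustration, and the regex assumes the `TIME:` line always has the shape shown in the sample.

```javascript
// sketch: extract the result (first line) and the TIME value from hvml output.
// `parseHvmlOutput` is a hypothetical name, not part of hvml.
function parseHvmlOutput(output) {
  const lines = output.split('\n');
  const result = lines[0].trim(); // the first line is the result
  // assumes a line shaped like "TIME: 1.0978210 seconds"
  const match = output.match(/^TIME:\s*([0-9.]+)\s*seconds/m);
  if (!match) throw new Error('TIME line not found');
  return { result, time: parseFloat(match[1]) };
}

// the sample output from above, line by line
const sample = [
  'λa λb a',
  'WORK: 117440614 interactions',
  'TIME: 1.0978210 seconds',
  'SIZE: 335544641 nodes',
  'PERF: 106.976 MIPS',
].join('\n');

console.log(parseHvmlOutput(sample)); // { result: 'λa λb a', time: 1.097821 }
```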
our goal is to create a _test_.js script that will test the hvml implementation
by running these files. to do so, we must extract the result (first line) and
the "time" from the output. to have a better time measure, we will run each
command 4 times, discard the first result, and average the other 3. we will
compare the result against the expected result. if the result is wrong, we will
console.log an error message. if the time increased by >5%, we will also log an
error message. the timings of the last run are saved on _perf_.json (create if
it doesn't exist). finally, note that, before running a command, we must
*adjust* the file, by replacing its '@main' line (i.e., a line starting with
the "@main" string) by a given line, and then saving it as _test_.hvml (i.e.,
do not overwrite the original file). the line we'll replace it with is given
in the table below, which also includes the expected result ("norm").
bench_cnots:
- intr:
- main: @main = (@P20 @not @true)
- norm: λa λb a
- comp:
- main: @main = (@P24 @not @true)
- norm: λa λb a
bench_count:
- intr:
- main: @main = @count(2_000_000 0)
- norm: 4000000
- comp:
- main: @main = @count(2_000_000_000 0)
- norm: 4000000000
bench_sum_range:
- intr:
- main: @main = @sum(@range(1_000_000 #Nil) 0)
- norm: 1783293664
- comp:
- main: @main = @sum(@range(300_000_000 #Nil) 0)
- norm: 3992170112
enum_coc_smart:
- intr:
- norm: "λλ(0 λλ((1 3) 2))"
- comp:
- norm: "λλ(0 λλ((1 3) 2))"
enum_lam_naive_blc:
- comp:
- norm: "λλ(1 λλ((2 0) 1))"
enum_lam_smart:
- intr:
- norm: "λ(0 λλλ((0 1) 2))"
- comp:
- norm: "λ(0 λλλ((0 1) 2))"
enum_nat:
- intr:
- main: @main = @if(@eq(@mul(@X @nat(20)) @nat(12000)) @u32(@X) *)
- norm: 600
- comp:
- main: @main = @if(@eq(@mul(@X @nat(20)) @nat(20000)) @u32(@X) *)
- norm: 1000
enum_primes:
- intr:
- main: @main = @if(@eq(@mul(@X_B @Y_B) @P_B) λt(t @u32(@X_B) @u32(@Y_B)) *)
- norm: λa ((a 853) 947)
- comp:
- main: @main = @if(@eq(@mul(@X_C @Y_C) @P_C) λt(t @u32(@X_C) @u32(@Y_C)) *)
- norm: λa ((a 25997) 27299)
feat_affine_ctx:
- intr:
- norm: 1
- comp:
- norm: 1
feat_cmul:
- intr:
- norm: λa λb (a (a (a (a b))))
- comp:
- norm: λa λb (a (a (a (a b))))
feat_hoas:
- intr:
- norm: "λx λy (x (x (x (x y))))"
- comp:
- norm: "λx λy (x (x (x (x y))))"
feat_mut_ref:
- intr:
- norm: 2
- comp:
- norm: 2
fuse_inc:
- intr:
- norm: 1234567
- comp:
- norm: 1234567
fuse_mul:
- intr:
- main: @main = @mul(12345 12345)
- norm: 152399025
- comp:
- main: @main = @mul(23232 32323)
- norm: 750927936
fuse_rot:
- intr:
- main: @main = (@read(@S) @sqr(12345 (@add(@S) @KA) @KB))
- norm: 209865
- comp:
- main: @main = (@read(@S) @sqr(54321 (@add(@S) @KA) @KB))
- norm: 923457
note that the table above has two versions: intr and comp. this refers to
interpreted and compiled mode, respectively. each file is tested separately
for each mode. so, for example, on fuse_rot, we'll first replace main by
@main = (@read(@S) @sqr(12345 (@add(@S) @KA) @KB))
then, save to _test_.hvml, and run it with interpreted mode. we'll then check if
the result (i.e., the first line) is exactly 209865 (on the first run). if it
isn't, we'll show a red error on the screen. then, we'll run 3 more times, to
average the time. if the perf drops by >5%, we'll show a yellow warning on the
screen. note that, if the total run time is < 0.05s, we'll skip perf warnings
(as this is considered a non-benchmark-relevant test). then, we will do this
again, but now using the compiled mode. we'll do this for each test case on the
table above. note that some cases don't have a 'main' function. that means you
don't need to replace main by anything, just copy to _test_.hvml. note that
some programs don't have a mode (like intr) in the table above. when that's the
case, you shouldn't run that mode on that program.
note that some normal forms include the quote (") character. that's because
these programs return strings. you must include them in your check.
create now the _test_.js file, which will be responsible for running these
tests, reporting errors and saving _perf_.json. keep it simple and well
documented, with rich comments documenting the purpose of the file, capturing
and explaining all the instructions given in this prompt.
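the timing rule described above can be sketched in isolation as follows. all names here (`averageAfterWarmup`, `isRegression`, `MIN_TIME`, `THRESHOLD`) are hypothetical, and the sketch compares the averaged time against the 0.05s floor, which is only one possible reading of "total run time".

```javascript
// sketch of the timing rule: 4 runs, drop the first, average the other 3,
// and flag a regression only when the run is long enough to matter.
const MIN_TIME = 0.05;  // seconds; below this, perf warnings are skipped
const THRESHOLD = 1.05; // more than 5% slower than the saved time

function averageAfterWarmup(times) {
  const kept = times.slice(1); // discard the first run
  return kept.reduce((a, b) => a + b, 0) / kept.length;
}

function isRegression(previous, current) {
  if (previous === undefined) return false; // nothing saved in _perf_.json yet
  if (current < MIN_TIME) return false;     // not a benchmark-relevant test
  return current > previous * THRESHOLD;
}

const avg = averageAfterWarmup([1.9, 1.2, 1.1, 1.3]);
console.log(avg.toFixed(2));           // 1.20
console.log(isRegression(1.0, avg));   // true
console.log(isRegression(1.19, avg));  // false
console.log(isRegression(0.01, 0.03)); // false
```

keeping the threshold and the floor as named constants makes it easy to tune them without touching the comparison itself.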

VictorTaelin commented Feb 24, 2025

generated output by Grok3:

/**
 * HVML Test Script
 *
 * This script tests the HVML implementation by running a series of .hvml files
 * in both interpreted and compiled modes, as specified in the test table.
 * For each test case:
 * - If a 'main' line is provided, it replaces the '@main' line in the original file
 *   and saves it as '_test_.hvml'. Otherwise, it copies the original file to '_test_.hvml'.
 * - Runs the test 4 times:
 *   - First run: Extracts the result (first line of output) and checks it against
 *     the expected 'norm'. Logs an error if they don't match.
 *   - Next 3 runs: Measures the execution time (from the 'TIME:' line in output).
 * - Averages the times from the last 3 runs.
 * - Compares the average time with the previous run's time stored in '_perf_.json'.
 *   If the new time is >5% slower and the total run time is >= 0.05 seconds, logs a warning.
 * - Updates '_perf_.json' with the new average times after all tests.
 *
 * Key Notes:
 * - Only runs modes ('intr' or 'comp') specified in the test table for each file.
 * - Some tests lack a 'main' line, meaning no replacement is needed—just copy the file.
 * - Uses Node.js 'fs' for file operations and 'child_process.execSync' to run commands.
 * - Assumes the script runs in a directory containing the .hvml files.
 * - Commands use a hardcoded project directory (/Users/v/vic/dev/HVM3) for interpreted mode.
 */

const fs = require('fs');
const { execSync } = require('child_process');

// ### Test Specifications
// Array of test objects, each defining a file and its test cases for 'intr' (interpreted)
// and/or 'comp' (compiled) modes. Each mode object has an expected 'norm' and an optional 'main'.
const tests = [
  {
    file: 'bench_cnots.hvml',
    intr: { main: '@main = (@P20 @not @true)', norm: 'λa λb a' },
    comp: { main: '@main = (@P24 @not @true)', norm: 'λa λb a' }
  },
  {
    file: 'bench_count.hvml',
    intr: { main: '@main = @count(2_000_000 0)', norm: '4000000' },
    comp: { main: '@main = @count(2_000_000_000 0)', norm: '4000000000' }
  },
  {
    file: 'bench_sum_range.hvml',
    intr: { main: '@main = @sum(@range(1_000_000 #Nil) 0)', norm: '1783293664' },
    comp: { main: '@main = @sum(@range(300_000_000 #Nil) 0)', norm: '3992170112' }
  },
  {
    file: 'enum_coc_smart.hvml',
    intr: { norm: '"λλ(0 λλ((1 3) 2))"' },
    comp: { norm: '"λλ(0 λλ((1 3) 2))"' }
  },
  {
    file: 'enum_lam_naive_blc.hvml',
    comp: { norm: '"λλ(1 λλ((2 0) 1))"' }
  },
  {
    file: 'enum_lam_smart.hvml',
    intr: { norm: '"λ(0 λλλ((0 1) 2))"' },
    comp: { norm: '"λ(0 λλλ((0 1) 2))"' }
  },
  {
    file: 'enum_nat.hvml',
    intr: { main: '@main = @if(@eq(@mul(@X @nat(20)) @nat(12000)) @u32(@X) *)', norm: '600' },
    comp: { main: '@main = @if(@eq(@mul(@X @nat(20)) @nat(20000)) @u32(@X) *)', norm: '1000' }
  },
  {
    file: 'enum_primes.hvml',
    intr: { main: '@main = @if(@eq(@mul(@X_B @Y_B) @P_B) λt(t @u32(@X_B) @u32(@Y_B)) *)', norm: 'λa ((a 853) 947)' },
    comp: { main: '@main = @if(@eq(@mul(@X_C @Y_C) @P_C) λt(t @u32(@X_C) @u32(@Y_C)) *)', norm: 'λa ((a 25997) 27299)' }
  },
  {
    file: 'feat_affine_ctx.hvml',
    intr: { norm: '1' },
    comp: { norm: '1' }
  },
  {
    file: 'feat_cmul.hvml',
    intr: { norm: 'λa λb (a (a (a (a b))))' },
    comp: { norm: 'λa λb (a (a (a (a b))))' }
  },
  {
    file: 'feat_hoas.hvml',
    intr: { norm: '"λx λy (x (x (x (x y))))"' },
    comp: { norm: '"λx λy (x (x (x (x y))))"' }
  },
  {
    file: 'feat_mut_ref.hvml',
    intr: { norm: '2' },
    comp: { norm: '2' }
  },
  {
    file: 'fuse_inc.hvml',
    intr: { norm: '1234567' },
    comp: { norm: '1234567' }
  },
  {
    file: 'fuse_mul.hvml',
    intr: { main: '@main = @mul(12345 12345)', norm: '152399025' },
    comp: { main: '@main = @mul(23232 32323)', norm: '750927936' }
  },
  {
    file: 'fuse_rot.hvml',
    intr: { main: '@main = (@read(@S) @sqr(12345 (@add(@S) @KA) @KB))', norm: '209865' },
    comp: { main: '@main = (@read(@S) @sqr(54321 (@add(@S) @KA) @KB))', norm: '923457' }
  }
];

// ### Load Previous Performance Data
// Load '_perf_.json' if it exists, otherwise initialize an empty object.
let perfData = {};
try {
  perfData = JSON.parse(fs.readFileSync('_perf_.json', 'utf8'));
} catch (e) {
  console.log('[INFO] _perf_.json not found, initializing as empty.');
}

// ### Counters for Errors and Warnings
// Track the number of errors (result mismatches) and warnings (perf regressions).
let errorCount = 0;
let warningCount = 0;

// ### Main Test Loop
for (const test of tests) {
  const file = test.file;
  // Check each mode: 'intr' (interpreted) and 'comp' (compiled).
  for (const mode of ['intr', 'comp']) {
    if (test[mode]) { // Only run if the mode is specified in the test.
      console.log(`Running ${file} in ${mode} mode...`);
      const { main, norm } = test[mode];

      // Prepare the test file by adjusting '@main' or copying as needed.
      prepareTestFile(file, main);

      let times = [];
      for (let i = 0; i < 4; i++) {
        const output = runTest(mode);
        const { result, time } = parseOutput(output);

        console.log("- time:", time.toFixed(7), "| norm: " + result);

        if (i === 0) {
          // First run: Check the result against the expected norm.
          if (result !== norm) {
            console.log(`[ERROR] For ${file} in ${mode} mode, expected "${norm}", but got "${result}"`);
            errorCount++;
          }
        } else {
          // Next 3 runs: Collect execution times.
          times.push(time);
        }
      }

      // Calculate average time from the last 3 runs.
      const averageTime = times.reduce((a, b) => a + b, 0) / times.length;
      const key = `${file}_${mode}`; // Unique key for this test and mode.
      const previousTime = perfData[key];

      // Check for performance regression (>5% slower) if previous data exists and time >= 0.05s.
      if (previousTime && averageTime > previousTime * 1.05 && averageTime >= 0.05) {
        console.log(`[WARNING] Performance regression for ${file} in ${mode} mode: ` +
                    `previous ${previousTime.toFixed(6)}s, now ${averageTime.toFixed(6)}s`);
        warningCount++;
      }

      // Update performance data with the new average time.
      perfData[key] = averageTime;
    }
  }
}

// ### Save Updated Performance Data
// Write the new performance data to '_perf_.json' with readable formatting.
fs.writeFileSync('_perf_.json', JSON.stringify(perfData, null, 2), 'utf8');

// ### Summary
// Log the total number of errors and warnings.
console.log(`All tests completed with ${errorCount} errors and ${warningCount} warnings.`);

// ### Helper Functions

/**
 * Prepares the test file by replacing the '@main' line or copying the original file.
 * @param {string} originalFile - The original .hvml file path.
 * @param {string|undefined} mainLine - The new '@main' line to use, if provided.
 */
function prepareTestFile(originalFile, mainLine) {
  if (mainLine) {
    // Read the original file and split into lines.
    const lines = fs.readFileSync(originalFile, 'utf8').split('\n');
    // Find the line starting with '@main'.
    const index = lines.findIndex(line => line.startsWith('@main'));
    if (index !== -1) {
      lines[index] = mainLine; // Replace the '@main' line.
    } else {
      // If no '@main' line exists, append the new one and warn.
      console.log(`[WARNING] No @main line found in ${originalFile}, adding at the end.`);
      lines.push(mainLine);
    }
    // Save the modified content to '_test_.hvml'.
    fs.writeFileSync('_test_.hvml', lines.join('\n'), 'utf8');
  } else {
    // If no mainLine is provided, copy the original file as is.
    fs.copyFileSync(originalFile, '_test_.hvml');
  }
}

/**
 * Runs the HVML test in the specified mode and returns the output.
 * @param {string} mode - 'intr' for interpreted, 'comp' for compiled.
 * @returns {string} - The command output.
 */
function runTest(mode) {
  let command;
  if (mode === 'intr') {
    command = 'cabal run -v0 hvml --project-dir=/Users/v/vic/dev/HVM3 -- run _test_.hvml -C -s';
  } else if (mode === 'comp') {
    command = 'hvml run _test_.hvml -c -C -s';
  } else {
    throw new Error(`Unknown mode: ${mode}`);
  }
  // Execute the command and capture the output as a UTF-8 string.
  return execSync(command, { encoding: 'utf8' });
}

/**
 * Parses the command output to extract the result and time.
 * @param {string} output - The output from running the HVML command.
 * @returns {{result: string, time: number}} - The result (first line) and time in seconds.
 */
function parseOutput(output) {
  const lines = output.split('\n');
  const result = lines[0].trim(); // First line is the result.
  const timeLine = lines.find(line => line.startsWith('TIME:'));
  if (!timeLine) {
    throw new Error('TIME line not found in output');
  }
  const timeStr = timeLine.split(' ')[1]; // Extract time value after 'TIME:'.
  const time = parseFloat(timeStr);
  if (isNaN(time)) {
    throw new Error(`Failed to parse time: ${timeStr}`);
  }
  return { result, time };
}
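
the prompt asks for a red error and a yellow warning, while the script above prints plain `[ERROR]` and `[WARNING]` tags. one way to add color, assuming the terminal understands standard ANSI escape codes, is sketched below. the helper names (`red`, `yellow`, `logError`, `logWarning`) are hypothetical and not part of the generated script.

```javascript
// sketch: ANSI color helpers for the red error / yellow warning output.
// assumes a terminal that interprets ANSI SGR escape sequences.
const RESET = '\x1b[0m';
const red = (s) => `\x1b[31m${s}${RESET}`;    // SGR 31 = red foreground
const yellow = (s) => `\x1b[33m${s}${RESET}`; // SGR 33 = yellow foreground

function logError(msg) { console.log(red(`[ERROR] ${msg}`)); }
function logWarning(msg) { console.log(yellow(`[WARNING] ${msg}`)); }

logError('For fuse_rot.hvml in intr mode, expected "209865", but got "0"');
logWarning('Performance regression for fuse_rot.hvml in intr mode');
```

the two `console.log` calls in the main loop could then be swapped for `logError` and `logWarning` without changing anything else.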

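the bench_sum_range norms used in the script can be sanity-checked with plain arithmetic. this is only a sketch under two assumptions that this page does not confirm: that `@range(n)` yields 0, 1, ..., n-1, and that `@sum` wraps around at 2^32 (u32 arithmetic). `sumRangeU32` is a hypothetical name.

```javascript
// sketch: recompute the expected bench_sum_range results.
// assumptions: @range(n) yields 0..n-1, and the sum wraps at 2^32.
function sumRangeU32(n) {
  const N = BigInt(n);
  const total = (N * (N - 1n)) / 2n; // 0 + 1 + ... + (n-1)
  return (total % (1n << 32n)).toString();
}

console.log(sumRangeU32(1_000_000));   // 1783293664
console.log(sumRangeU32(300_000_000)); // 3992170112
```

under those assumptions, 1_000_000 maps to 1783293664 (interpreted mode) and 300_000_000 maps to 3992170112 (compiled mode), which agrees with the values in the script.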