Geoffroy Warin geowarin

@geowarin
geowarin / LLM cheat.md
Created August 10, 2025 17:55
LLM cheat

AI Models and Benchmark Cheating: A Summary

Recent research reveals a troubling phenomenon in AI evaluation: leading language models have been "cheating" on the benchmarks designed to test their capabilities. The paper "Benchmarking Benchmark Leakage in Large Language Models" (BenBench) [1] shows that benchmark dataset leakage has become increasingly prevalent, undermining fair comparisons between models. Leakage occurs when a model's training data includes benchmark test sets, which lets the model memorize answers rather than demonstrate genuine understanding.

The researchers introduced a detection pipeline utilizing Perplexity and N-gram accuracy metrics to identify potential data leakage in models from major companies including Alibaba, Google, Meta, Microsoft, Mistral AI, and
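As a rough illustration of the two metrics named above, the following Python sketch computes perplexity from per-token log-probabilities and an n-gram accuracy score for one benchmark sample. It is a simplified reading of those metrics, not the paper's actual pipeline: the `predict_next` callable, the evenly spaced start positions, and the defaults `n=5` and `num_starts=5` are assumptions made here for illustration.

```python
import math
from typing import Callable, List, Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Perplexity = exp of the negative mean natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def ngram_accuracy(
    tokens: Sequence[str],
    predict_next: Callable[[Sequence[str], int], List[str]],  # hypothetical model wrapper
    n: int = 5,
    num_starts: int = 5,
) -> float:
    """Fraction of sampled positions where the model reproduces the next n tokens exactly."""
    if len(tokens) <= n:
        return 0.0
    # Evenly spaced start positions; each leaves a non-empty prefix and room for n tokens.
    last_start = len(tokens) - n
    starts = sorted(
        {1 + (i * (last_start - 1)) // max(num_starts - 1, 1) for i in range(num_starts)}
    )
    hits = 0
    for start in starts:
        prefix = tokens[:start]
        expected = list(tokens[start:start + n])
        if predict_next(prefix, n) == expected:
            hits += 1
    return hits / len(starts)
```

Under these assumptions, unusually low perplexity on a test sample, or a high share of exactly reproduced n-grams, would hint that the sample appeared in the training data. Neither number proves leakage on its own.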

I'll provide summaries of each paper/article based on the information you've provided and draw conclusions about current LLM limitations.

Paper Summaries

1. Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task

Summary: This study examines how using ChatGPT for essay writing affects cognitive processes. The research suggests that relying on AI assistants for writing may lead to an accumulation of "cognitive debt": a degradation in critical thinking and writing skills over time. Users may become overly dependent on AI assistance, potentially weakening their ability to perform these tasks independently.

2. Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity

geowarin / .editorconfig
Created February 5, 2022 01:50
C# editor config
# https://google.github.io/styleguide/csharp-style.html
[*.cs]
indent_style = space
indent_size = 2
tab_width = 4
insert_final_newline = true
max_line_length = 100
# Microsoft .NET properties
geowarin / MixamoImportTool.cs
Created May 5, 2020 12:29
Editor Window UIElements
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using UnityEditor;
using UnityEditor.Animations;
using UnityEditor.ShortcutManagement;
using UnityEditor.UIElements;
using UnityEngine;
using UnityEngine.Animations;
package com.geowarin.jooqgraphql
import java.util.*
class Node<T : Any>(
val data: T,
val dependencies: HashSet<Node<T>> = hashSetOf()
) {
override fun equals(other: Any?): Boolean = other is Node<*> && data == other.data
override fun hashCode(): Int = data.hashCode()
import com.sun.org.apache.xerces.internal.dom.DeferredAttrImpl
import org.intellij.lang.annotations.Language
import org.w3c.dom.Document
import org.w3c.dom.Node
import org.w3c.dom.NodeList
import java.io.File
import javax.xml.parsers.DocumentBuilderFactory
import javax.xml.xpath.XPath
import javax.xml.xpath.XPathConstants
import javax.xml.xpath.XPathFactory
package fds.zookeeper.toto;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;
import java.io.IOException;
import java.util.concurrent.CountDownLatch;
import java.util.function.Consumer;
geowarin / getHtml.kt
Created February 27, 2018 10:16
SSR with J2V8
fun getHtml(componentPath: String, modelJson: String, currentUrl: String): String {
val nodeJS = NodeJS.createNodeJS()
nodeJS.runtime.add("componentPath", componentPath)
nodeJS.runtime.add("modelJson", modelJson)
nodeJS.runtime.add("currentUrl", currentUrl)
try {
val vendors = nodeJS.require(File(bundleLocation, "vendor.js"))
val mainModule = nodeJS.require(File(bundleLocation, "pages.js"))
export declare function pluck<T, K1 extends keyof T>(p: T, property: K1): T[K1];
export declare function pluck<
T,
K1 extends keyof T,
K2 extends keyof T[K1]
>(o: T, property1: K1, property2: K2): T[K1][K2];
export declare function pluck<
T,
K1 extends keyof T,
K2 extends keyof T[K1],