This project is called foobar. Its goal is to provide ...
This file contains additional guidance for AI agents and other AI editors.
| import tarfile | |
| import time | |
| import urllib.request | |
| from collections import OrderedDict | |
| from pathlib import Path | |
| import numpy as np | |
| import scipy.io | |
| import scipy.sparse | |
| from sklearn.decomposition import TruncatedSVD |
| """ | |
| Test: Validate that cuml native estimators can be converted to ONNX | |
| via as_sklearn() -> skl2onnx -> onnxruntime. | |
| Unlike cuml.accel proxies (which skl2onnx recognizes directly), native cuml | |
| estimators must first be converted to sklearn via as_sklearn() before | |
| skl2onnx.convert_sklearn() will accept them. | |
| Run without cuml.accel: | |
| python test_onnx_as_sklearn.py |
Created: 2026-01-07 Last Updated: 2026-01-07
Scikit-learn's Array API support enables estimators and functions to work with arrays from different libraries (NumPy, CuPy, PyTorch) without modification. This allows computations to run on GPUs when using GPU-backed array libraries.
The implementation follows the Array API Standard, a specification that defines a common API for array manipulation libraries.
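As a rough illustration of the idea behind the Array API Standard, the sketch below writes a function once against whatever namespace the input array reports. It uses NumPy only so that it runs anywhere. The helper names `namespace_of` and `standardize` are hypothetical and are not part of scikit-learn, and the fallback to NumPy for arrays without `__array_namespace__` is an assumption made for this example.

```python
import numpy as np


def namespace_of(x):
    """Return the Array API namespace of x (hypothetical helper).

    Arrays that follow the standard expose __array_namespace__();
    falling back to NumPy here is an assumption for this sketch.
    """
    if hasattr(x, "__array_namespace__"):
        return x.__array_namespace__()
    return np


def standardize(x):
    """Center and scale the columns of x using only namespace functions."""
    xp = namespace_of(x)
    mean = xp.mean(x, axis=0)
    std = xp.std(x, axis=0)
    return (x - mean) / std


X = np.asarray([[1.0, 2.0], [3.0, 6.0]])
Z = standardize(X)
```

Because `standardize` never calls NumPy directly, the same body could in principle accept an array from another library that implements the standard, which is the property that lets a computation follow the input array onto a GPU.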
| #!/usr/bin/env python3 | |
| """ | |
| Ray + RandomForestClassifier with max_calls=1 | |
| Demonstrates the impact of max_calls=1 on Ray task execution when using | |
| scikit-learn's RandomForestClassifier. | |
| """ | |
| import time | |
| import ray | |
| from sklearn.datasets import make_classification |
| """ | |
| Benchmark: scikit-learn RandomForest vs LightGBM RandomForest | |
| Compares performance across: | |
| - Number of samples (1K, 10K, 100K, 500K) | |
| - Number of features (10, 50, 200) | |
| - Feature types (numerical, categorical, mixed) | |
| - Number of classes (2, 5, 10) | |
| Includes cases optimized for LightGBM's strengths: |
| name: tabareana-20251202 | |
| channels: | |
| - conda-forge | |
| dependencies: | |
| - _libgcc_mutex=0.1=conda_forge | |
| - _openmp_mutex=4.5=2_gnu | |
| - bzip2=1.0.8=hda65f42_8 | |
| - ca-certificates=2025.11.12=hbd8a1cb_0 | |
| - cuda-cccl_linux-64=13.0.85=ha770c72_0 | |
| - cuda-cudart-dev_linux-64=13.0.96=h376f20c_0 |
| from __future__ import annotations | |
| import warnings | |
| warnings.simplefilter("error", FutureWarning) | |
| from pathlib import Path | |
| from typing import Any | |
| import pandas as pd | |
| from tabarena.benchmark.experiment import AGModelBagExperiment, ExperimentBatchRunner |
Purpose: This checklist is optimized for AI assistants (like Cursor) to perform automated PR reviews. It separates automatable checks from those requiring human judgment, provides specific patterns to detect, and includes commands to run.