Once SparseTIR was accepted to ASPLOS 2023, we began constructing the sparsetir-artifact repository for artifact evaluation. Though we already have lots of benchmarking scripts, we found it still not a trivial job to put them together and evaluate in a unified manner. While preparing our artifact, we also found some problems with our profiler and bugs in existing implementations. We carefully addressed these issues and standardized the settings for all baselines. We are writing this post to document the challenges we faced and the lessons we learned from creating the artifact. We aim to provide insight that will benefit researchers and engineers working in related fields.
If you previously read our manuscript on ArXiv, you may have noticed that there are some discrepancies in the reported performance between SparseTIRv3 and our [camera-ready ver