They all suggest that apparent improvements to the state of the art in ML and related fields are often not real, or are at least the result of factors other than what the authors claim.
The state of sparsity in deep neural networks
What is the state of neural network pruning?
On the State of the Art of Evaluation in Neural Language Models
Do Transformer Modifications Transfer Across Implementations and Applications?