Things I already know, but need a checklist for so that I address them systematically.
- Are the datasets balanced (train, dev and test)? Do you have approximately balanced classes in each?
- Is each dataset distinct? Have you checked for duplicates within and across datasets?
- Is the data shuffled?
- Spot check the data. Are the classifications consistent and in line with the objective?
- Score the model before training. Is its accuracy close to chance (roughly 1/k for k balanced classes)?
- Train the simplest available model (no pretrained vectors) on a small subset of the data and let it overfit. Does its loss decrease? Does its accuracy rise above random?
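
A minimal sketch of the data checks above, assuming each split is a list of (text, label) pairs. The function names and the toy data are hypothetical, duplicates are detected by exact string match only (near-duplicates would need normalisation first), and the label-run check is just a rough heuristic for spotting unshuffled data.

```python
from collections import Counter
from itertools import groupby


def class_balance(labels):
    """Return each class's share of the examples in one split."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def duplicates_within(texts):
    """Return the texts that occur more than once in one split."""
    counts = Counter(texts)
    return {text for text, n in counts.items() if n > 1}


def overlap_across(texts_a, texts_b):
    """Return the texts that appear in both splits (exact match only)."""
    return set(texts_a) & set(texts_b)


def longest_label_run(labels):
    """Length of the longest run of identical consecutive labels.

    A run that is long relative to the split size hints that the data
    is sorted by class rather than shuffled.
    """
    return max((sum(1 for _ in group) for _, group in groupby(labels)), default=0)


def uniform_chance(labels):
    """Expected accuracy of uniform random guessing: 1 / number of classes."""
    return 1 / len(set(labels))


# Toy data, invented purely for illustration.
train = [
    ("good movie", "pos"),
    ("bad movie", "neg"),
    ("great plot", "pos"),
    ("dull plot", "neg"),
    ("good movie", "pos"),
]
test = [("bad movie", "neg"), ("fine acting", "pos")]

train_texts = [text for text, _ in train]
train_labels = [label for _, label in train]
test_texts = [text for text, _ in test]

balance = class_balance(train_labels)
dupes = duplicates_within(train_texts)
leaked = overlap_across(train_texts, test_texts)
run = longest_label_run(train_labels)
chance = uniform_chance(train_labels)

print("class balance:", balance)
print("duplicates within train:", dupes)
print("train/test overlap:", leaked)
print("longest label run:", run)
print("uniform chance accuracy:", chance)
```

Running the same functions on every split (and every pair of splits) covers the first three items. The `uniform_chance` value gives a reference point for the untrained model's score.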