import numpy as np

# Assume labels is a (possibly multidimensional) array of category / token indices
_, label_counts = np.unique(labels, axis=None, return_counts=True)  # axis=None flattens multidimensional arrays
# For multi-label classification you should normalize by the number of samples instead
label_frequencies = label_counts.astype(np.float64) / np.sum(label_counts)
label_logprobs = np.log(label_frequencies)
# Now you just need to assign the values in label_logprobs to your bias vector!
# In TensorFlow, this will look something like:
model.classifier.bias.assign(label_logprobs)
# In PyTorch (after `import torch`):
with torch.no_grad():
    # copy_ casts the float64 NumPy values to the dtype of the bias parameter
    model.classifier.bias.copy_(torch.from_numpy(label_logprobs))
# The exact name of the weight to assign to will depend on the specific model head you're using
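
# Note that np.unique only reports classes that actually occur in labels, so
# label_logprobs can end up shorter than the bias vector if some class is missing.
# The block below is a minimal sketch of one way to get exactly one entry per class.
# Everything in it is an assumption rather than part of the snippet above: the helper
# name label_log_priors, the num_classes and smoothing parameters, and the choice to
# treat negative labels (such as a -100 ignore index) as padding to be dropped.

```python
import numpy as np


def label_log_priors(labels, num_classes, smoothing=1.0):
    """Return log class frequencies with one entry per class index.

    Hypothetical helper: `smoothing` is added to every count so that classes
    absent from `labels` get a finite log-probability instead of -inf.
    """
    flat = np.asarray(labels).reshape(-1)
    # Assumption: negative values mark ignored positions (e.g. -100) and are dropped.
    flat = flat[flat >= 0]
    # bincount with minlength yields a count for every class, including unseen ones.
    counts = np.bincount(flat, minlength=num_classes).astype(np.float64)
    counts += smoothing
    return np.log(counts / counts.sum())


# Toy labels: classes 1 and 3 never occur, and one position is ignored.
labels = np.array([[0, 2, 2], [2, 0, -100]])
log_priors = label_log_priors(labels, num_classes=4)
```

# The resulting array has length num_classes, so it can be assigned to the bias in
# the same way as label_logprobs above.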