Ways to optimize a Deeplearning4j model: FP32 to FP16 quantization
import org.deeplearning4j.util.ModelSerializer;
import org.nd4j.linalg.api.buffer.DataType;

// Tip 1: Do not save the updater if you do not plan to continue training your model.
// Set the saveUpdater flag to false:
ModelSerializer.writeModel(model, modelFilename, false);
// Result: model size drops by almost 40%.

// Tip 2: Convert the parameters from FP32 to FP16 floating-point precision (quantization).
// DL4J currently supports three floating-point data types: DOUBLE, FLOAT, and HALF.
model = model.convertDataType(DataType.HALF);
// Result: model size drops by 50%, half of its original size. Accuracy did not drop.

// You can check the parameter data type with:
System.out.println(model.params().dataType());