#!/usr/bin/env python3
"""
TinyAdder: 9-parameter hand-crafted transformer for 10-digit addition.

This is admittedly pushing the rules: the parameters are aggressively
deduplicated and arange is used. However, once these 9 unique floats are
arranged into the weights, the model performs only standard transformer ops.

Parameter counting (explicit scalars only):
- Count scalar tensors created in __init__ (shared scalars count once).
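The counting convention above can be sketched in a few lines. This is a hypothetical illustration (the class and its two scalars are invented for the example, not the model's real parameters); it only demonstrates that scalar tensors created in `__init__` are what gets counted, with shared scalars counted once:

```python
import torch


class TinyBlock(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # Explicit scalar parameters: each counts once, even if it is
        # later arranged (e.g. via arange/stack) into larger weight tensors.
        self.a = torch.nn.Parameter(torch.tensor(1.0))
        self.b = torch.nn.Parameter(torch.tensor(0.5))


def count_explicit_scalars(module):
    # Module.parameters() already deduplicates shared Parameter objects,
    # so a scalar reused across several weights is counted once.
    return sum(1 for p in module.parameters() if p.dim() == 0)
```

Under this convention the tensor shapes the scalars are expanded into do not matter; only the distinct scalar Parameters do.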
#!/usr/bin/env python3
"""
TinyAdder: 36-parameter hand-crafted transformer for 10-digit addition.

Parameter counting:
- Identity mappings (direct copy): 0 params
- Broadcast (1 value to N outputs): 1 param
- Distinct values: count each
"""
import torch
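The three counting rules above can be made concrete with a small helper. This is a sketch of the convention, not code from the model: an identity weight is a direct copy (0 params), a matrix filled with one broadcast value contributes 1, and anything else contributes its number of distinct values:

```python
import torch


def count_params(w):
    # Identity mapping (direct copy): contributes nothing.
    if w.dim() == 2 and w.shape[0] == w.shape[1] and torch.equal(w, torch.eye(w.shape[0])):
        return 0
    # Otherwise count distinct values; a broadcast matrix
    # (one value repeated N times) naturally yields 1.
    return len(torch.unique(w))
```

For example, `torch.eye(3)` counts as 0 even though it contains two distinct values, because it performs a pure copy.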
#!/usr/bin/env python3
"""
TinyAdder: A 36-parameter hand-crafted transformer for 10-digit addition.

This model adds two 10-digit numbers with 100% accuracy using only 36 unique
parameters.

Architecture:
- 2-layer transformer with ALiBi positional encoding
- Layer 0: 5 attention heads (only 2 active), ReGLU FFN
- Layer 1: 1 head uniform attention, V-shaped error FFN
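ALiBi, named in the architecture list above, replaces positional embeddings with a linear distance penalty added to attention scores before the softmax. A minimal sketch of that bias for causal attention (the slope value here is illustrative, not one of the model's actual parameters):

```python
import torch


def alibi_bias(seq_len, slope):
    # ALiBi bias for causal attention: each query position i pays a
    # penalty of -slope * (i - j) for attending to an earlier key j.
    pos = torch.arange(seq_len)
    dist = pos[:, None] - pos[None, :]         # i - j
    return -slope * dist.clamp(min=0).float()  # no penalty on the diagonal


bias = alibi_bias(4, slope=0.5)
```

The bias is added to the raw attention scores, so more distant tokens are attended to less, which gives the model position information without any learned positional parameters.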
#!/usr/bin/env python3
"""
TinyAdder: A hand-crafted 95-parameter transformer that performs 10-digit
addition with ~100% accuracy.

Only non-zero parameters are counted, so the nominal parameter count is
higher, but most entries are zero.

Architecture:
- 2-layer transformer with ALiBi positional encoding
- Layer 0: 5 attention heads
- Layer 1: 1 head uniform attention
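The non-zero counting convention of this revision is easy to state in code. A sketch (the `Linear` layer below is a stand-in example, not part of the model): zero entries are excluded, so a mostly-zero weight matrix contributes only its non-zero entries to the total:

```python
import torch


def count_nonzero_params(model):
    # Count only entries that are not exactly zero, matching the
    # docstring's convention that zeros do not count as parameters.
    return sum(int((p != 0).sum()) for p in model.parameters())


lin = torch.nn.Linear(4, 4, bias=False)  # nominally 16 parameters
with torch.no_grad():
    lin.weight.zero_()
    lin.weight[0, 0] = 1.0               # one non-zero entry
```

Here the layer nominally has 16 parameters, but under this convention it contributes only 1.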