
@tysam-code
tysam-code / condensed-ml-tidbits.txt
Last active December 11, 2023 04:22
TODOTODOTODOTODO # workinprogress <3 :'))))
# [IN-DEV currently]
# Maintained/Initially created by Fern. Say hi to me and feel free to ask any questions as needed! <3 :'))))
# If anything here is self-cited or has no citation, it's a conclusion I arrived at over time, or something I derived
# from the basics. There may still be prior work elaborating it in further detail (feel free to comment if there's an especially relevant link).
# Misc
- LayerNorm/RMSNorm might be acting as lateral inhibition, a paradigm explored in many ML papers from the 2000s and surrounding years (Fern, {relevant sources needed})
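A minimal sketch of the intuition (my illustration, not from any cited paper): in RMSNorm, every unit is divided by the root-mean-square of the whole vector, so when one unit's activation grows, its neighbors' normalized outputs shrink, which is a lateral-inhibition-like competition.

```python
import numpy as np

def rms_norm(x, eps=1e-8):
    # Divide each element by the RMS of the whole vector: a unit's
    # normalized output depends on the magnitudes of its neighbors.
    return x / np.sqrt(np.mean(x * x) + eps)

calm = np.array([1.0, 2.0, 2.0])
spike = np.array([1.0, 2.0, 10.0])  # one unit fires much harder

# The first unit's activation is identical in both inputs, but its
# normalized output drops when a neighbor spikes -- the "inhibition".
print(rms_norm(calm)[0], rms_norm(spike)[0])
```

(LayerNorm adds mean-subtraction and learned affine parameters on top, but the shared-denominator competition is the same.)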
- 'Soft' (pre-determined or pre-compiled) architectures in the weights of your network can greatly improve convergence speed and/or generalization.
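One hypothetical illustration of the idea (names and scales are my own, not from the note): instead of hard-wiring a residual connection into the architecture, bias a dense layer's init toward the identity, so the "skip" behavior lives softly in the weights where training can still overwrite it.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8

# Plain random init: the layer starts as arbitrary small mixing.
w_random = rng.normal(scale=0.02, size=(dim, dim))

# 'Soft' architecture: the same layer, biased toward the identity map.
# At init it approximately passes inputs through (a residual-like prior
# baked into the weights rather than the wiring).
w_soft = np.eye(dim) + rng.normal(scale=0.02, size=(dim, dim))

x = rng.normal(size=dim)
# The soft-init layer starts much closer to the identity function:
print(np.linalg.norm(w_soft @ x - x), np.linalg.norm(w_random @ x - x))
```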
- Downcasting to a lower-bit dtype in your dot products can be a 'free' efficiency improvement in some circumstances.
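A quick sanity check of the trade-off, sketched in NumPy with float64 → float32 as the example downcast (the same logic applies to float16/bfloat16 on hardware with fast low-precision matmuls): memory traffic halves and throughput typically rises, while the accumulated rounding error in the dot products stays tiny for well-scaled inputs.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=(256, 256))  # float64 by default
b = rng.normal(size=(256, 256))

ref = a @ b  # full-precision reference result
low = (a.astype(np.float32) @ b.astype(np.float32)).astype(np.float64)

# Worst-case relative error introduced by the downcast dot products --
# usually negligible next to the halved memory footprint.
rel_err = np.abs(ref - low).max() / np.abs(ref).max()
print(rel_err)
```

Whether it is actually 'free' depends on the hardware and the model's sensitivity, hence the hedge in the note; accumulating in higher precision is a common middle ground.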