Audio over Ethernet is a large space. Folk have tried to skin this CAT (groan) in many ways and on many network layers, including layer 1 (Ethernet PHY with a custom MAC), layer 2 (raw Ethernet frames), and layers 3/4 (UDP/RTP/etc.).
For my purposes I'm mainly interested in a layer 2/3 implementation that can run on a >= 400 MHz bare-metal target with a cycle or two left over for happy-joy-joy DSP fun-times.
Smuggling audio frames over the Internet is not a main focus for me because there's already a most excellent solution to that problem which is easy to implement on even smollish microcontrollers.
So, of the 50 or so implementations out there, for the purpose of this discussion we can probably narrow things down to the two (and maaaaybe a 1/2) protocols that have gotten some commercial traction over the years.