MacOS USB MIDI Corruption
When sending modestly sized System Exclusive MIDI messages to a USB MIDI device, CoreMIDI will sometimes corrupt the message by dropping some blocks of the message while duplicating others.
At best, devices receive corrupted messages, detect them, and reject them. The user sees their operation fail, but has no recourse other than to try again.
At worst, devices use the corrupted message data, saving, unknown to the user, invalid sounds, patches, etc. (For example: swapping two blocks of samples in the middle of a transmitted sound may not cause the sample to look invalid to the device, but when played later it will clearly be corrupted.)
This corruption has been observed no matter the content of the MIDI System Exclusive message. Therefore this impacts all MIDI software & MIDI devices that handle such messages.
This has been observed on MacOS 13 and MacOS 14. I haven't got a trace on MacOS 15 yet.
Set up:
- Arrange to send a number of larger SysEx packets to a device. 100 messages of 60k each is good enough. The issue still happens with smaller sizes, but you'll need more messages to trigger it.
- Set up MIDI Monitor to spy on the USB MIDI device you're going to send to.
- Set up a way to capture the actual data sent to the MIDI Device, such as:
  - Use a USB capture device to capture the packets
  - Use WireShark to capture the USB interface (XHCn) on the host
  - Use a USB MIDI device that will exactly capture and save received SysEx
Run:
- Start capturing on both MIDI Monitor & USB
- Send the messages
- Note that the messages captured in MIDI Monitor are exactly what was sent
- Extract the captured USB MIDI data
- Note that the messages sent over USB do not match
(N.B: You'll have to wait a long time for MIDI Monitor to catch up.. but it will. There is a bug in the code that resizes a data buffer once per byte! Bug filed with them...)
I'm the author of a program that saves and restores settings data from USB MIDI connected synthesizers. Such programs are common, and known as Patch Librarians. There are several for Mac OS, including a popular generic one known as "SysEx Librarian".
Working on support of a new synthesizer, which uses 76k byte SysEx messages for settings, I observed that sometimes a packet of settings would be sent but simply not stored by the device. Resending the same data a 2nd or 3rd time would often succeed.
Using the program "MIDI Monitor" to observe what MIDI data was being sent to the device, I could confirm that the data I was sending was valid, and identical between the times it worked and the times it failed.
I suspected the device, but the device gives no useful diagnostics. So I used a USB capture device to see the interaction with the device. (The same can be done using Wireshark.) I extracted the data from the captures to verify it was being sent correctly... and discovered that the MIDI stream being sent to the device was corrupt.
To verify, I did the same test, capturing the MIDI with MIDI Monitor at the same time. The MIDI captured at the MIDI layer was correct, but the MIDI in the USB packets differed. This pointed to a bug in CoreMIDI's translation to USB MIDI.
In CoreMIDI when a MIDI stream is directed at a USB MIDI device, the MIDI message stream must get re-encoded into USB-MIDI Event Packet (UMEP) format. Blocks of 512 bytes (128 UMEP 4-byte messages) are then sent to the USB endpoint.
UMEP is similar to, but not the same as, MIDI 2.0's Universal MIDI Packet format. See Universal Serial Bus Device Class Definition for MIDI Devices.
All MIDI messages fit in a single 4-byte UMEP message except System Exclusive messages ("SysEx" for short). These messages are variable length, and are encoded, 3 bytes per 4 byte UMEP message.(*) Thus, when sending a SysEx of more than a few hundred bytes, each 384 bytes of SysEx is put into 128 4-byte UMEP messages and sent as a single 512 byte USB DATA packet.
By sending SysEx messages with a known, fixed pattern of bytes, and then capturing the USB traffic to the end device, I could compare the sequence of bytes sent, to the sequence of bytes in the SysEx message encoded in the UMEP messages on the wire.
I found that groups of 384 bytes of my original message were sometimes skipped, and sometimes duplicated. Furthermore, these blocks always aligned with the 512 byte USB DATA packets. That is, a whole 512 byte USB packet's worth of data is sometimes skipped, and sometimes duplicated.
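The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the code in BigFoot or umep-decode.py: the test pattern (every byte of block i holds the value i) and the names `make_pattern` and `diagnose` are assumptions made for the example.

```python
BLOCK = 384  # SysEx bytes carried by one 512-byte USB packet (128 packets x 3 bytes)

def make_pattern(n_blocks):
    """Hypothetical test pattern: every byte of block i is the value i (i < 128)."""
    return b"".join(bytes([i]) * BLOCK for i in range(n_blocks))

def diagnose(received):
    """Walk the received bytes one block at a time and report where the block
    index jumps ahead (a skipped block) or falls back (a repeated block)."""
    events = []
    expected = 0
    for offset in range(0, len(received), BLOCK):
        seen = received[offset]
        if seen != expected:
            kind = "skip ahead" if seen > expected else "duplicate prior"
            events.append((offset, kind, abs(seen - expected)))
        expected = seen + 1  # resynchronise on what was actually seen
    return events

sent = make_pattern(8)
blocks = [sent[i:i + BLOCK] for i in range(0, len(sent), BLOCK)]

# Simulate the observed failure: block 2 is skipped, then block 3 is sent twice.
corrupted = b"".join(blocks[i] for i in (0, 1, 3, 3, 4, 5, 6, 7))

print(diagnose(sent))
print(diagnose(corrupted))
```

Resynchronising after each mismatch is what makes one dropped block and one repeated block show up as a pair of events rather than as a long run of errors.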
I can tell that the blocks are not dropped or duplicated at the USB level, as the strict alternation of DATA0 and DATA1 frames holds. When a block is duplicated, it is sent first as one type of DATA frame, then as the other. This implies that the USB system is being told to send two identical blocks.
The correct number of blocks is always written, duplicating the last block if needed multiple times. This causes an invalid MIDI stream: When this block is decoded from UMEP back to a MIDI stream on the device, it looks like a SysEx message without a starting 0xF0 byte, but with a trailing 0xF7 byte. These malformed messages were the first clue that the USB data sent by the Mac was corrupt.
The pattern of dropped and duplicated blocks is generally only off by one block. For example: the 23rd block will be skipped, followed by the 24th block twice. Or similarly the 40th block will be sent twice, then the 41st block dropped. Often a series of errors happens in a row, and then it "gets back on track" and things are fine... until the end, where the last block will be duplicated if needed to make the block count correct.
To this grizzled software engineer, this strongly implicates a ring buffer of 512 byte blocks where one thread is encoding the MIDI stream into UMEP a block at a time, and another is handing them off to the USB layer... and there is a bug where the read/write pointers into this ring buffer get off by a block. But that's just a hunch.
This is parsed out of a USB capture. The Mac is sending 60k SysEx packets with a fixed pattern of non-repeating bytes.
00027110-0003A994 sys ex [60000] F0 60 60 20 20 00 20 20 01 20...60 60 F7
0003A998-0004E01C sys ex [59616] F0 60 60 20 20 00 20 20 01 20...60 60 F7
@ 0005fd expected 20-23-7e, found 20-24-7e skip ahead 1 block
@ 000a7d expected 20-27-7e, found 20-26-7e duplicate prior 1 block
@ 000bfd expected 20-27-7e, found 20-28-7e skip ahead 1 block
@ 001dfd expected 20-34-7e, found 20-33-7e duplicate prior 1 block
@ 001f7d expected 20-34-7e, found 20-35-7e skip ahead 1 block
@ 0023fd expected 20-38-7e, found 20-37-7e duplicate prior 1 block
@ 00257d expected 20-38-7e, found 20-39-7e skip ahead 1 block
@ 002e7d expected 20-3f-7e, found 20-3e-7e duplicate prior 1 block
@ 002ffd expected 20-3f-7e, found 20-40-7e skip ahead 1 block
@ 00317d expected 20-41-7e, found 20-40-7e duplicate prior 1 block
@ 0032fd expected 20-41-7e, found 20-42-7e skip ahead 1 block
@ 00347d expected 20-43-7e, found 20-42-7e duplicate prior 1 block
@ 0035fd expected 20-43-7e, found 20-44-7e skip ahead 1 block
@ 0038fd expected 20-46-7e, found 20-45-7e duplicate prior 1 block
@ 003a7d expected 20-46-7e, found 20-47-7e skip ahead 1 block
@ 00557d expected 20-59-7e, found 20-58-7e duplicate prior 1 block
@ 0056fd expected 20-59-7e, found 20-5a-7e skip ahead 1 block
0004E020-0004E21C sys ex [ 382] 1F 21 5B 20 21 5B 21 21 5B 22...60 60 F7
The first line is a successful 60k SysEx message.
The second line shows a 59616 byte SysEx message. It is 384 bytes short, and has a number of errors, outlined after it. In particular, at various points in the stream, it detects that the pattern has either jumped ahead, skipping a block of 384 bytes, or duplicated a prior block. From the detection points you can work out that these fall exactly on the 512 byte USB packet boundaries.
At the end, you can see that the whole was short one block. This manifests as a separate, malformed SysEx (no starting byte, but has the end byte), which is really a repeat of the last block of the prior message.
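The numbers in the trace can be cross-checked with a little arithmetic. The sketch below copies the detection offsets and event kinds from the trace, and assumes, as the trace reports, that every event is exactly one 384-byte block. It checks that consecutive detection points are a whole number of blocks apart and that the net effect matches the observed short message.

```python
BLOCK = 384  # SysEx bytes per 512-byte USB packet

# (offset, kind) pairs copied from the trace above
events = [
    (0x0005fd, "skip"), (0x000a7d, "dup"), (0x000bfd, "skip"),
    (0x001dfd, "dup"), (0x001f7d, "skip"), (0x0023fd, "dup"),
    (0x00257d, "skip"), (0x002e7d, "dup"), (0x002ffd, "skip"),
    (0x00317d, "dup"), (0x0032fd, "skip"), (0x00347d, "dup"),
    (0x0035fd, "skip"), (0x0038fd, "dup"), (0x003a7d, "skip"),
    (0x00557d, "dup"), (0x0056fd, "skip"),
]

offsets = [offset for offset, _ in events]
gaps = [b - a for a, b in zip(offsets, offsets[1:])]

# Every gap between detection points is a whole number of 384-byte blocks.
aligned = all(gap % BLOCK == 0 for gap in gaps)

skips = sum(1 for _, kind in events if kind == "skip")
dups = sum(1 for _, kind in events if kind == "dup")

# Each skip loses a block, each duplicate gives one back.
net_missing = (skips - dups) * BLOCK

print(aligned)              # True
print(skips, dups)          # 9 8
print(net_missing)          # 384
print(60000 - net_missing)  # 59616
```

Nine skips against eight duplicates leaves the message one block short, which is the 59616 byte length shown on the second line of the trace.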
The UMEP encoding for SysEx messages makes use of four different message types: one for every three bytes of SysEx from the start, and then one of three message types for the last one, two or three bytes.
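As a rough sketch of that scheme (not CoreMIDI's actual code), the following Python encodes a complete SysEx message into 4-byte packets. The Code Index Numbers 0x4 to 0x7 come from the USB MIDI class specification; the function name `encode_sysex` and the `cable` parameter default are assumptions made for illustration.

```python
def encode_sysex(sysex: bytes, cable: int = 0) -> bytes:
    """Encode one complete SysEx message (F0 ... F7) as USB-MIDI event packets."""
    if len(sysex) < 2 or sysex[0] != 0xF0 or sysex[-1] != 0xF7:
        raise ValueError("expected a complete SysEx message, F0 ... F7")

    out = bytearray()
    i = 0
    # Everything except the final one to three bytes: CIN 0x4, three bytes each.
    while len(sysex) - i > 3:
        out.append((cable << 4) | 0x4)  # SysEx starts or continues
        out += sysex[i:i + 3]
        i += 3

    # The tail picks its message type by how many bytes remain.
    tail = sysex[i:]
    cin = {1: 0x5, 2: 0x6, 3: 0x7}[len(tail)]  # SysEx ends with 1, 2 or 3 bytes
    out.append((cable << 4) | cin)
    out += tail.ljust(3, b"\x00")  # unused bytes are zero padded
    return bytes(out)

# A five byte message becomes two packets: 04 F0 01 02, then 06 03 F7 00.
print(encode_sysex(bytes([0xF0, 0x01, 0x02, 0x03, 0xF7])).hex(" "))

# 384 bytes of SysEx fill exactly one 512-byte USB packet.
print(len(encode_sysex(b"\xf0" + bytes(382) + b"\xf7")))
```

This sketch never emits the single-byte message type, so it produces the plain 384-to-512 relationship without the CoreMIDI quirks.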
UMEP encoding also offers an "escape" message type that encodes one byte of a MIDI stream. The spec offers little guidance other than that this message is to be used when an application prefers not to parse a MIDI stream and just wants to transfer it.
CoreMIDI will, at very regular points in a SysEx message, encode just one byte using this "escape" message, then continue using the SysEx three byte UMEP encoding. For example, the 7 bytes 11 12 13 14 15 16 17 in the middle of a SysEx might get encoded as these three UMEP messages:
04 11 12 13 -- SysEx start or continue, three bytes
0F 14 00 00 -- Single "unparsed" byte
04 15 16 17 -- SysEx start or continue, three bytes
This usage isn't discussed in the spec, and other operating systems do not do this. It appears that devices do handle it correctly, however.
CoreMIDI chooses to do this at exactly these offsets:
0x002ef
0x08000
0x0FFFF
0x10000
None of this matters too much, except that it occasionally throws off the relationship between 384 bytes of SysEx and 512 bytes of UMEP encoded USB data, and it took me a while to figure out what was going on.
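A decoder that tolerates the single-byte message only needs to know how many payload bytes each message type carries. The sketch below is a simplified illustration, not the umep-decode.py from this repository: it handles only the SysEx-related Code Index Numbers plus the single-byte type 0xF, and the name `decode_umep` is made up for the example.

```python
# Payload bytes carried by each Code Index Number this sketch understands.
PAYLOAD_BYTES = {
    0x4: 3,  # SysEx starts or continues
    0x5: 1,  # SysEx ends with one byte
    0x6: 2,  # SysEx ends with two bytes
    0x7: 3,  # SysEx ends with three bytes
    0xF: 1,  # single "unparsed" byte
}

def decode_umep(data: bytes) -> bytes:
    """Recover the MIDI byte stream from a run of 4-byte USB-MIDI event packets."""
    out = bytearray()
    whole = len(data) - len(data) % 4  # ignore a trailing partial packet
    for i in range(0, whole, 4):
        cin = data[i] & 0x0F           # low nibble; high nibble is the cable number
        count = PAYLOAD_BYTES.get(cin)
        if count is None:
            continue                   # other message types are skipped here
        out += data[i + 1:i + 1 + count]
    return bytes(out)

# The three packets from the example above decode back to the original 7 bytes.
packets = bytes.fromhex("04111213 0F140000 04151617")
print(decode_umep(packets).hex(" "))
```

Because the single-byte packet carries one byte where a regular packet carries three, every such packet shifts the mapping between SysEx offsets and USB packet boundaries by two bytes.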
USB at the delivery to the device's endpoint has only two integrity checks:
- Each DATA block has a 16-bit CRC.
- Data transmission alternates between DATA0 and DATA1 packets.
These checks can ensure that a single packet isn't directly repeated or dropped, but cannot otherwise vouch for integrity beyond that.
In the traces we see the strict alternation of DATA0 and DATA1 packets. That is, when we see a block of the MIDI message repeated, we see it first with one DATA packet, then with the other.
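To make the limits of the DATA0/DATA1 check concrete, here is a deliberately simplified model of a receiving endpoint. It is an assumption-laden sketch rather than a faithful USB implementation: it ignores CRCs, handshakes and timing, and the name `receive` is invented for the example.

```python
def receive(packets):
    """packets is a list of (toggle, payload) pairs as seen on the wire.
    The receiver discards any packet whose toggle is not the one it expects,
    treating it as a retransmission of the packet it already accepted."""
    accepted = []
    expected = 0
    for toggle, payload in packets:
        if toggle != expected:
            continue  # same toggle as last time: a retry, drop it
        accepted.append(payload)
        expected ^= 1
    return accepted

# A link-level retry repeats the packet with the same toggle, so it is filtered out.
print(receive([(0, "A"), (0, "A"), (1, "B")]))

# If the host queues block B twice, the copies carry alternating toggles
# and both are accepted as valid new data.
print(receive([(0, "A"), (1, "B"), (0, "B"), (1, "D")]))
```

The second case is the one seen in the traces: the toggle sequence is perfectly valid, so nothing at the USB level can flag the repeated block.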
It is unfortunate that neither MIDI, nor USB MIDI encapsulation (UMEP) has any integrity checks whatsoever. Neither can even check that a System Exclusive message is the correct length.
Many MIDI device manufacturers have designed integrity checks into their System Exclusive message formats, generally in the form of CRC and/or length checks. These vary widely in effectiveness. (For example, simple checksums will not notice swapping of blocks.)
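The point about simple checksums is easy to demonstrate. The sketch below compares a 7-bit additive checksum, used here as a stand-in for the kind of checksum some SysEx formats carry (not any particular manufacturer's), with CRC-32 from Python's standard `zlib` module, on a payload where two 384-byte blocks have been swapped.

```python
import zlib

BLOCK = 384

def additive_checksum(data: bytes) -> int:
    """Sum of all bytes, kept to 7 bits so it fits in a MIDI data byte."""
    return sum(data) & 0x7F

# Four blocks, each filled with its own index so that the blocks differ.
blocks = [bytes([n]) * BLOCK for n in range(4)]
original = b"".join(blocks)

# Swap the two middle blocks, as in the corruption described above.
swapped = b"".join([blocks[0], blocks[2], blocks[1], blocks[3]])

# Addition does not care about order, so the checksum is unchanged.
print(additive_checksum(original) == additive_checksum(swapped))

# A CRC depends on byte positions, so the swap changes it.
print(zlib.crc32(original) == zlib.crc32(swapped))
```

A length field would not help with a swap either, since the swapped message has the same length as the original.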
Three tools were developed during my sleuthing, and made available:
BigFoot is a command line Mac OS program for generating SysEx packets with a fixed non-repeating data pattern. The packets use a reserved SysEx identifier, so these can be safely sent to any USB MIDI device, and it will just ignore them.
Typical usage:
./bigfoot -c 60000 -p -d "Electron Digitakt II"
$ ./bigfoot -?
./bigfoot [-l|-x] [-s|-p] [-n count] [-d destination]
-l list destinations (default)
-x dump sysex to stdout
-s send via MIDISysexSendRequest (default)
-p send via MIDIPacketList (like Chromium)
-n size of the sysex to send
-d destination to send to
N.B.: The code has two ways of sending large SysEx messages:
- -s makes use of MIDISysexSendRequest, but unfortunately this API is too slow to use for any real application.
- -p uses repeated calls to MIDISend with a MIDIPacketList. This is the way most applications send MIDI, including SysEx. The code here is modeled on Chromium's WebMIDI implementation.
If you have a .pcap file you can extract the data of the transfer using tshark:
capturefile=traces/bigfoot-100x49k.pcapng
datafile=extracted/bigfoot-100x49k.data
device=20.18.1
tshark -r $capturefile \
-Y 'usb.src == "'$device'"' \
-T fields -e usb.capdata \
| xxd -r -p > $datafile
Note that this works for a WireShark capture on the host itself using the XHCnn devices. When doing so, note that usb.src and usb.dst are swapped on MacOS, hence usb.src above. If doing something similar on Linux, change that to usb.dst.
The USB packet data, which is a series of UMEP messages, can be analyzed to see what's in it with this python program:
python3 umep-decode.py extracted/bigfoot-100x49k.data
If the data contains BigFoot SysEx packets, it will further check their integrity and report issues.
Note: The analysis of errors in BigFoot packets isn't perfect, as CoreMIDI's encoding quirks with UMEP (see above) will throw this code off if a dropped block involves one of the "single byte" cases. In such cases this code will report a long string of errors, but in the cases I examined by hand, they were still all just single block drops or repeats.