Tuya's IR blasters, like the ZS08, have the ability to both learn and blast generic IR codes. These IR codes are given to the user as an opaque string, like this:
A/IEiwFAAwbJAfIE8gSLIAUBiwFAC+ADAwuLAfIE8gSLAckBRx9AB0ADBskB8gTyBIsgBQGLAUALA4sB8gRAB8ADBfIEiwHJAeARLwHJAeAFAwHyBOC5LwGLAeA97wOLAfIE4RcfBYsB8gTyBEAFAYsB4AcrCYsB8gTyBIsByQHgPY8DyQHyBOAHAwHyBEAX4BVfBIsB8gTJoAMF8gSLAckB4BUvAckB4AEDBfIEiwHJAQ==
Not much is known about the format of these IR code strings. That makes it difficult to use codes obtained through other means (such as a manual implementation of the IR protocol for a particular device, or public Internet code tables) with these blasters. It also makes it hard to use codes learnt through these blasters with other brands of blasters, or to study their contents.
So far I've only been able to find one person who dug into this before me, who was able to understand it enough to create their own codes to blast, but not enough to understand codes learnt by the device.
This document attempts to fully document the format and also provides a (hopefully) working Python implementation.
There is no standard for IR codes, so appliances use different methods to encode the data into an IR signal, often called "IR protocols". A popular one, which could be considered an unofficial standard, is the NEC protocol. NEC specifies a way to encode 16 bits as a series of pulses of modulated IR light, but it's just one protocol.
Tuya's IR blasters are meant to be generic and work with just about any protocol. To do that, they work at a lower level and record the IR signal directly instead of detecting a particular protocol and decoding the bits. In particular, the blaster records a binary signal like this one:
  +------+     +----------+  +-+
  |      |     |          |  | |
--+      +-----+          +--+ +---
Such a signal can be represented by noting the times at which the signal flips from low to high and vice versa. It is better to record the differences between these times, as they will be smaller numbers. For example, the above signal is represented as:
[7, 6, 11, 3, 2]
Meaning, the signal stays high for 7 units of time, then low for 6 units of time, then high for 11 units, and so on. The first time is always for a high state, which means even times (2nd, 4th, 6th...) are always low periods while odd times (1st, 3rd, 5th...) are always high periods.
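As a small illustration of the "differences of times" idea, here is a sketch that turns absolute transition times into the duration list. The function name `edges_to_durations` and the edge times (read off the diagram, with the first rising edge placed at t = 0) are my own choices for the example, not something the blaster exposes.

```python
def edges_to_durations(edge_times):
    """Return the time between each pair of consecutive signal transitions."""
    return [later - earlier for earlier, later in zip(edge_times, edge_times[1:])]

# Transition times for the diagram above; the first edge (low to high) is at t = 0.
edges = [0, 7, 13, 24, 27, 29]
print(edges_to_durations(edges))  # [7, 6, 11, 3, 2]
```

Because the first edge is a rising one, the durations at even list indices (0, 2, 4...) are high periods and the ones at odd indices are low periods.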
The blaster takes these numbers (in units of microseconds) and encodes each of them as a little-endian 16-bit integer, resulting in the following 10 bytes:
07 00 06 00 0B 00 03 00 02 00
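The byte layout above can be reproduced with Python's `struct` module. This is only a sketch of the uncompressed layout; the helper names `timings_to_bytes` and `bytes_to_timings` are made up for this example.

```python
import struct

def timings_to_bytes(timings):
    # "<" selects little-endian, "H" is an unsigned 16-bit integer.
    # A timing above 65535 does not fit and makes struct.pack raise struct.error.
    return struct.pack(f"<{len(timings)}H", *timings)

def bytes_to_timings(raw):
    # Inverse operation: every 2 bytes become one timing.
    return list(struct.unpack(f"<{len(raw) // 2}H", raw))

raw = timings_to_bytes([7, 6, 11, 3, 2])
print(raw.hex(" ").upper())   # 07 00 06 00 0B 00 03 00 02 00
print(bytes_to_timings(raw))  # [7, 6, 11, 3, 2]
```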
Because we're recording a signal rather than high-level protocol data, this results in very long messages in real life. So, the blaster compresses these bytes using a weird algorithm (see below), and then encodes the resulting bytes using base64 so the user can copy/paste the code easily.
Update: Turns out this is FastLZ compression. No need to read this section, you can go to their website instead.
I was unable to find a public algorithm that matched this, so I'm
assuming it's a custom lossless compression algorithm that a
random Tuya employee hacked to make my life more complicated.
Jokes aside it seems to be doing a very poor job, and if I were
them I would've just used Huffman coding or something.
Anyway, the algorithm is LZ77-based, with a fixed 8kB window. The stream contains a series of blocks. Each block begins with a "header byte", and the 3 MSBs of this byte determine the type of block:
- If the 3 bits are zero, then this is a literal block and the other 5 bits specify a length L minus one.
  Upon encountering this block, the decoder consumes the next L bytes from the stream and emits them as output.
  +--------+-----------------------------+
  |000LLLLL| 1..32 bytes, depending on L |
  +--------+-----------------------------+
- If the 3 bits have any other value, then this is a length-distance pair block; the 3 bits specify a length L minus 2, and the concatenation of the other 5 bits with the next byte specifies a distance D minus 1.
  Upon encountering this block, the decoder copies L bytes from the previous output. It begins copying D bytes before the output cursor, so if D = 1, the first copied byte is the most recently emitted byte; if D = 2, the byte before that one, and so on.
  As usual, it may happen that L > D, in which case the output repeats as necessary (for example if L = 5 and D = 2, and the 2 last emitted bytes are X and Y, the decoder would emit XYXYX).
  +--------+--------+
  |LLLDDDDD|DDDDDDDD|
  +--------+--------+
  As a special case, if the 3 bits are all ones (111), then there's an extra byte preceding the distance byte, which specifies a value to be added to L (so L is 9 plus that byte):
  +--------+--------+--------+
  |111DDDDD|LLLLLLLL|DDDDDDDD|
  +--------+--------+--------+
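The two block types described above can be turned into a short decoder. What follows is a minimal sketch written from that description, not Tuya's or FastLZ's actual code. All function names are hypothetical. The encoder only ever emits literal blocks, so it produces a valid stream without compressing anything, and I have not verified here that a blaster accepts such streams.

```python
import base64
import struct

def decompress(data):
    """Decode the block stream described above into the raw timing bytes."""
    out = bytearray()
    i = 0
    while i < len(data):
        header = data[i]
        i += 1
        top = header >> 5    # 3 MSBs: block type / length field
        low = header & 0x1F  # remaining 5 bits
        if top == 0:
            # Literal block: the next L = low + 1 bytes are emitted verbatim.
            length = low + 1
            out += data[i:i + length]
            i += length
        else:
            # Length-distance block: L = top + 2, plus an extra byte when top == 7.
            length = top + 2
            if top == 7:
                length += data[i]
                i += 1
            distance = ((low << 8) | data[i]) + 1
            i += 1
            if distance > len(out):
                raise ValueError("distance points before the start of the output")
            # Copy one byte at a time so that overlapping copies (L > D) repeat.
            for _ in range(length):
                out.append(out[-distance])
    return bytes(out)

def compress_literals_only(raw):
    """Wrap raw bytes in literal blocks of up to 32 bytes (no real compression)."""
    out = bytearray()
    for start in range(0, len(raw), 32):
        chunk = raw[start:start + 32]
        out.append(len(chunk) - 1)  # header: 000 followed by L - 1
        out += chunk
    return bytes(out)

def decode_ir(code):
    """Base64 code string -> list of timings in microseconds."""
    raw = decompress(base64.b64decode(code))
    return list(struct.unpack(f"<{len(raw) // 2}H", raw))

def encode_ir(timings):
    """List of timings in microseconds -> base64 code string (literal blocks only)."""
    raw = struct.pack(f"<{len(timings)}H", *timings)
    return base64.b64encode(compress_literals_only(raw)).decode("ascii")

code = encode_ir([7, 6, 11, 3, 2])
print(code)
print(decode_ir(code))  # [7, 6, 11, 3, 2]
```

As a sanity check against the sample code at the top of this document: its first base64 characters `A/IEiwFA` decode to the bytes 03 F2 04 8B 01 40. The header 03 announces a literal block of 4 bytes, which are the timings 0x04F2 = 1266 and 0x018B = 395 microseconds. The following byte 40 is the header of a length-distance block.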
Now, if you're interested in decoding the signal to obtain the bits that are transmitted in it: I'm not sure which protocol it's using (I'm not very knowledgeable about IR protocols), but it can be guessed. First, notice that exactly in the middle of the signal there is a pretty big 40ms space (i.e. a low period). It's common for remotes to send the code multiple times when you press a key, just in case one of the transmissions gets damaged by noise.
And indeed, if we check the signal before the 40ms space, we see that it matches the signal after it, so it's pretty safe to assume one is a retransmission of the other. So let's look only at the first transmission. It starts with a 9ms pulse followed by a 4.5ms space. This is also common: the remote sends a long pulse to indicate that it's about to start transmitting a message.
After the initial pulse and space, note how all of the values are either ~1.7ms or ~0.55ms. In particular, the high parts are always 0.55ms, while the low parts in between them are either 0.55ms or 1.7ms. So it's safe to say this protocol encodes bits as the spaces between 0.55ms pulses. If we treat 0.55ms spaces as a 0 and 1.7ms spaces as a 1, then your signals carry the following data: 01110110100010011101000000101111
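The manual decoding above can be sketched in a few lines. The function name `decode_pulse_distance`, its parameters and the 30% tolerance are assumptions for this example. It expects the list of timings in microseconds, starting with the 9ms leader pulse, and stops at the first space that is neither short nor long (such as the 40ms gap).

```python
def decode_pulse_distance(timings, short_us=550, long_us=1700, tolerance=0.3):
    """Read bits from the spaces between pulses: short space = 0, long space = 1."""
    # Index 0 is the leader pulse and index 1 the leader space, so the data
    # spaces sit at indices 3, 5, 7, ... (each one follows a ~0.55ms pulse).
    bits = []
    for space in timings[3::2]:
        if abs(space - short_us) <= tolerance * short_us:
            bits.append("0")
        elif abs(space - long_us) <= tolerance * long_us:
            bits.append("1")
        else:
            break  # e.g. the long gap before the retransmission
    return "".join(bits)

# Synthetic timings built from the bit string above, for demonstration only.
expected = "01110110100010011101000000101111"
frame = [9000, 4500]
for bit in expected:
    frame += [550, 1700 if bit == "1" else 550]
frame += [550, 40000]  # final pulse, then the gap before the repeat
print(decode_pulse_distance(frame + frame) == expected)  # True
```

One observation, offered tentatively: in the bit string above, the second group of 8 bits is the bitwise inverse of the first, and the fourth is the inverse of the third. Together with the 9ms/4.5ms leader and the ~0.55ms/~1.7ms timings, that is consistent with the NEC protocol mentioned earlier, which sends each payload byte followed by its inverse.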
I encourage you to do this decoding for all of the keys on your remote, and you may be able to guess what each bit means! You can also experiment by constructing your own messages, sending them to your appliance and seeing what that does. But be aware that some of the bits may be control bits used to detect damaged transmissions, and changing some of the bits without recalculating the control bits accordingly may make your appliance reject the message.