"""Goal here is to obfuscate GPG blocks using Markov chains.

Steps:
1. Train a Markov chain with some news articles on a similar topic.
2. Take in a GPG armored ASCII block.
3. Use that input to determine which branch of the Markov chain to walk.
4. The result would be a new article similar to the inputs.
5. The user could then take the same program, run it against the same
   inputs, and output the GPG block.

For #4, we need to send the receiving user the inputs. Since this is an
article though, we could just include them as links in the text.

Inspiration:
I was reading an article on theintercept.com about blaming encryption
for terror attacks. I think that's an absurd premise. The author, Glenn
Greenwald, quoted a former FBI director about needing "viable key
management infrastructure" which sounds fucking terrifying.

I thought this would be an interesting mechanism to "hide" that you're
transmitting an encrypted message, in case it ever became such a thing
that you weren't legally allowed to send.

Original Link: https://theintercept.com/2015/11/15/exploiting-emotions-about-paris-to-blame-snowden-distract-from-actual-culprits-who-empowered-isis/
"""
Steganography makes me think of images. Given that each character in an ASCII-armored message is one of about 64 possibilities, it seems reasonable that you could simply map them into a 7-bit color space. Or perhaps an animated GIF where each frame is a full panel of a single color, with one frame per character.
I thought Markov chains might be deterministic given the same input, but "research Markov chains" was next on my list. :)
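For what it's worth, here is a rough sketch of the color-space idea, assuming one grayscale pixel per character and plain 7-bit ASCII input. The function names are made up for illustration, and writing the pixels into an actual image file is left out.

```python
def armor_to_pixels(armor: str) -> list[int]:
    """Map each 7-bit ASCII character to one grayscale value in 0..254."""
    pixels = []
    for ch in armor:
        code = ord(ch)
        if code > 127:
            raise ValueError(f"non-ASCII character: {ch!r}")
        # Doubling spreads the 128 possible codes across the 8-bit range.
        pixels.append(code * 2)
    return pixels


def pixels_to_armor(pixels: list[int]) -> str:
    """Invert armor_to_pixels by halving each pixel value."""
    return "".join(chr(p // 2) for p in pixels)
```

This only shows the mapping; a row of pixels like this would probably still look like noise, so it doesn't hide much on its own.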
This is likely to be indistinguishable from random noise, though, no? The problem here, and the reason steganography applies, is that you're trying to hide seemingly random noise within something that simply appears to be unmodified. Take a look at Bacon's Cipher as it directly applies to "stegotext", according to the wiki page linked above.
That's neat. I could see making that w/ font tags, then maybe capturing it as an svg image or something.
I was thinking less about hiding it in random noise and more about obfuscating the fact that you're sending crypto things.
The AB thing Bacon uses is neat. If you had a markov chain with 2+ possible stems from the original word, you could just deterministically pick them based on the expression of A or B. If you had < 2 possible stems, you skip that one and continue. Not sure if you'd hit a cyclical loop or not. shrug
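A minimal sketch of that A/B-driven walk, assuming a first-order word chain, successors kept in sorted order so both sides agree on which one is "A" and which is "B", and only the first two successors ever used. The names `build_chain`, `encode_bits`, and `decode_bits` are hypothetical. On the cyclical-loop worry: the sketch restarts from the start word at a dead end and gives up if it walks longer than the number of known words without reaching a branch point.

```python
from collections import defaultdict


def build_chain(text: str) -> dict[str, list[str]]:
    """Map each word to the sorted list of distinct words that follow it."""
    followers = defaultdict(set)
    words = text.split()
    for cur, nxt in zip(words, words[1:]):
        followers[cur].add(nxt)
    return {word: sorted(nexts) for word, nexts in followers.items()}


def encode_bits(chain: dict[str, list[str]], start: str, bits: list[int]) -> list[str]:
    """Walk the chain, consuming one bit at every word with 2+ successors."""
    words = [start]
    cur = start
    idle = 0  # steps taken since the last bit was consumed
    i = 0
    while i < len(bits):
        options = chain.get(cur, [])
        if len(options) >= 2:
            cur = options[bits[i]]  # 0 picks "A", 1 picks "B"
            i += 1
            idle = 0
        else:
            # Forced move: the single successor, or back to start at a dead end.
            cur = options[0] if options else start
            idle += 1
            if idle > len(chain) + 1:
                raise ValueError("walk is stuck in a cycle with no branch points")
        words.append(cur)
    return words


def decode_bits(chain: dict[str, list[str]], words: list[str]) -> list[int]:
    """Recover the bits by checking which branch was taken at each branch point."""
    bits = []
    for cur, nxt in zip(words, words[1:]):
        options = chain.get(cur, [])
        if len(options) >= 2:
            bits.append(options[:2].index(nxt))
    return bits
```

Both sides need the same training text and the same start word. Using only two successors per branch point means one bit per branch; a real version could squeeze more bits out of words with many successors.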
I'm not entirely sure Markov chains fit in here, but it might be possible.
To your original point of storing the encoded message in the original article, what if you use spaces between words, or between sentences to encode the {A,B} from Bacon's Cipher? Now, we have 7 words per armor block character, and we'd have to encode some mechanism for "EOM", but, now we have perfectly readable text with some inconsistent use of the space bar. If it's encoded further in HTML, it displays perfectly fine to the viewer, and most would be none the wiser. Don't treat '\n' as space.
Algorithm becomes:
- Encrypt message and get ASCII armor block
- Ensure that the article you're sending is at least Len(ASCII armor block) * 7 words.
- Bacon encipher the ASCII block
- Explode the article text on spaces (spaces could even be contained in the HTML tags)
- Reassemble based on the Bacon Encipher output, inserting 1 space for A, 2 for B
- ???
- Profit
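The steps above could be sketched roughly like this, assuming 7 bits per armor character (A = 0 = one space, B = 1 = two spaces) and a receiver who already knows how many characters to expect, since no "EOM" is encoded here. The names `hide` and `reveal` are invented for the example, and only the literal space character is treated as a separator, so '\n' stays inside its word.

```python
import re


def char_bits(text: str) -> list[int]:
    """Expand each 7-bit ASCII character into 7 bits, most significant first."""
    bits = []
    for ch in text:
        code = ord(ch)
        if code > 127:
            raise ValueError(f"non-ASCII character: {ch!r}")
        bits.extend((code >> shift) & 1 for shift in range(6, -1, -1))
    return bits


def hide(cover: str, armor: str) -> str:
    """Rejoin the cover words with 1 space for a 0 bit and 2 spaces for a 1 bit."""
    words = [w for w in cover.split(" ") if w]  # split on spaces only, not '\n'
    bits = char_bits(armor)
    if len(words) - 1 < len(bits):
        raise ValueError("cover text has too few word gaps for this message")
    out = [words[0]]
    for i, word in enumerate(words[1:]):
        bit = bits[i] if i < len(bits) else 0  # leftover gaps stay single
        out.append(" " * (bit + 1) + word)
    return "".join(out)


def reveal(stego: str, n_chars: int) -> str:
    """Read n_chars characters back out of the run lengths of the space gaps."""
    gaps = re.findall(r" +", stego)[: n_chars * 7]
    bits = [len(gap) - 1 for gap in gaps]
    chars = []
    for i in range(0, len(bits), 7):
        code = 0
        for bit in bits[i : i + 7]:
            code = (code << 1) | bit
        chars.append(chr(code))
    return "".join(chars)
```

Note that N words only give N - 1 gaps, so the cover needs one more word than the bit count.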
Idea: Instead of encoding an EOM, frame the message with a 4 "byte" byte count, Bacon ciphered, of course.
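A small sketch of that framing, assuming a big-endian 4-byte length prefix and ordinary 8-bit bytes rather than 7-bit characters. The `frame` and `unframe` names are hypothetical; the resulting bit list is what would then be fed to whatever A/B encoding is in use.

```python
def bits_to_bytes(bits: list[int]) -> bytes:
    """Pack a list of bits (most significant first) into bytes."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i : i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


def frame(payload: bytes) -> list[int]:
    """Prefix the payload with its length as 4 bytes, then expand to bits."""
    data = len(payload).to_bytes(4, "big") + payload
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def unframe(bits: list[int]) -> bytes:
    """Read the 4-byte length, then exactly that many payload bytes."""
    if len(bits) < 32:
        raise ValueError("not enough bits for the length header")
    length = int.from_bytes(bits_to_bytes(bits[:32]), "big")
    end = 32 + 8 * length
    if len(bits) < end:
        raise ValueError("bit stream is shorter than the framed length")
    return bits_to_bytes(bits[32:end])  # anything after `end` is ignored
```

With the length up front, the decoder can ignore whatever trailing gaps the rest of the article produces.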
Someone should blog about this, with a basic implementation, though it's probably not novel.
Recording discussion on twitter.
I pointed out we could use &nbsp; instead of 2 spaces. Andrew suggested we could use U+00A0 instead so it was less obvious.
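A sketch of the U+00A0 variant, assuming a regular space stands for A (0) and a no-break space for B (1), that the cover text contains no no-break spaces of its own, and that the receiver knows the bit count. `hide_bits` and `reveal_bits` are names invented here.

```python
NBSP = "\u00a0"  # NO-BREAK SPACE


def hide_bits(cover: str, bits: list[int]) -> str:
    """Join the cover words with ' ' for a 0 bit and U+00A0 for a 1 bit."""
    if NBSP in cover:
        raise ValueError("cover text already contains U+00A0")
    # Split on the plain space only; str.split() with no argument would
    # also split on '\n' and on U+00A0 itself.
    words = [w for w in cover.split(" ") if w]
    if len(words) - 1 < len(bits):
        raise ValueError("cover text has too few word gaps for these bits")
    out = [words[0]]
    for i, word in enumerate(words[1:]):
        sep = NBSP if i < len(bits) and bits[i] else " "
        out.append(sep + word)
    return "".join(out)


def reveal_bits(stego: str, n_bits: int) -> list[int]:
    """Read the first n_bits separators back as bits."""
    seps = [ch for ch in stego if ch in (" ", NBSP)]
    return [1 if ch == NBSP else 0 for ch in seps[:n_bits]]
```

Every gap stays one character wide, so the text keeps its length and renders the same in most viewers, though a no-break space can change where lines wrap.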
First thought is that a Markov chain isn't the right approach, because by definition it lacks determinism, which is crucial to recovering the original ASCII armored message. But, maybe there's some steganography research that would help figure this out.