Created March 25, 2017 10:53
Save GreyCat/9dba530b0d2cb8ccec4e1d6e90a0b565 to your computer and use it in GitHub Desktop.
Convert C struct into Kaitai Struct spec (.ksy)
#!/usr/bin/env ruby

require 'yaml'

# Maps C type names to Kaitai Struct primitive types.
# Unknown type names are passed through unchanged.
TYPE_TO_KSY = {
  'uint8_t' => 'u1',
  'uint16_t' => 'u2',
  'uint32_t' => 'u4',
  'uint64_t' => 'u8',
  'int8_t' => 's1',
  'int16_t' => 's2',
  'int32_t' => 's4',
  'int64_t' => 's8',
  'char' => 's1',
  'short' => 's2',
  'int' => 's4',
  'uint' => 'u4',
  'long' => 's8',
  'float' => 'f4',
  'double' => 'f8',
  'WORD' => 'u2',
  'DWORD' => 'u4',
  '__s32' => 's4',
  '__s64' => 's8',
}

# Parses "type name; /* optional comment */" members out of a struct body.
# Returns an array of hashes, one per member.
def parse_body(body)
  r = []
  body.gsub(/\s*([A-Za-z0-9_]+)\s+([A-Za-z0-9_]+)\s*;\s*(\/\*\s*(.*?)\s*\*\/)?/) {
    type, name, comment, comment_body = $1, $2, $3, $4
    h = {'id' => name, 'type' => TYPE_TO_KSY[type] || type}
    h['doc'] = comment_body unless comment_body.nil? or comment_body.empty?
    r << h
  }
  r
end

r = {}

# Matches "struct [tag] { body } [name, ...];" anywhere in the input.
$stdin.read.gsub(/struct\s+([A-Za-z0-9_]+)?\s*\{(.*?)\}\s*(.*?);/m) {
  tag, body, name = $1, $2, $3
  # Use the first declared name; fall back to the struct tag if there is none.
  name = name.split(/\s*,\s*/)[0]
  name = tag if name.nil? or name.empty?
  r[name] = parse_body(body)
}

puts({'types' => r}.to_yaml)
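To see what the script does on a small input, here is a condensed sketch of the same logic wrapped in a function so it can be run on a string instead of stdin. The names `convert`, `TYPES`, `FIELD_RE`, `STRUCT_RE` and `SAMPLE` are made up for this illustration, and the type table is cut down to the two types the sample needs. The two regexps are copied from the script above.

```ruby
require 'yaml'

# Reduced copy of the type table, just enough for the sample below.
TYPES = { 'uint16_t' => 'u2', 'uint32_t' => 'u4' }

# Same member regexp as parse_body in the script above.
FIELD_RE = /\s*([A-Za-z0-9_]+)\s+([A-Za-z0-9_]+)\s*;\s*(\/\*\s*(.*?)\s*\*\/)?/
# Same struct regexp as the script's main loop.
STRUCT_RE = /struct\s+([A-Za-z0-9_]+)?\s*\{(.*?)\}\s*(.*?);/m

# Hypothetical helper: converts C source text into a hash shaped like a .ksy file.
def convert(src)
  result = {}
  src.scan(STRUCT_RE) do |tag, body, name|
    # First declared name wins; fall back to the struct tag.
    name = name.split(/\s*,\s*/)[0]
    name = tag if name.nil? || name.empty?
    result[name] = body.scan(FIELD_RE).map do |type, id, _comment, doc|
      h = { 'id' => id, 'type' => TYPES[type] || type }
      h['doc'] = doc unless doc.nil? || doc.empty?
      h
    end
  end
  { 'types' => result }
end

SAMPLE = <<~C
  struct header {
      uint32_t magic; /* file signature */
      uint16_t version;
  };
C

ksy = convert(SAMPLE)
puts ksy.to_yaml
```

This should produce a single `header` type with two fields, `magic` (`u4`, with the comment carried over as `doc`) and `version` (`u2`). A declaration such as `typedef struct { ... } foo_t;` would be keyed by `foo_t` instead, since the name after the closing brace takes priority over the tag.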
Sure, if you're interested in this, we can make it a full-fledged project. I would start with some solid tests first then — right now, it's mostly ad-hoc tested with some arbitrary inputs I've got and that's it.
For arrays, I guess we can modify the line-parsing regexp to accommodate a possible [n] index.
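A minimal sketch of what that could look like, assuming fixed-size arrays with a literal integer length only (no macros, no multi-dimensional arrays) and leaving out the comment capture for brevity. `TYPE_MAP`, `MEMBER_RE` and `parse_members` are hypothetical names, not part of the script. The array length is emitted using Kaitai Struct's `repeat: expr` / `repeat-expr` keys.

```ruby
require 'yaml'

# Hypothetical subset of the type table, enough for this sketch.
TYPE_MAP = { 'uint8_t' => 'u1', 'uint32_t' => 'u4' }

# Like the original member regexp, plus an optional "[n]" after the name.
# The length is captured as the third group; it is nil for non-array members.
MEMBER_RE = /\s*([A-Za-z0-9_]+)\s+([A-Za-z0-9_]+)\s*(?:\[\s*(\d+)\s*\])?\s*;/

def parse_members(body)
  body.scan(MEMBER_RE).map do |type, name, count|
    h = { 'id' => name, 'type' => TYPE_MAP[type] || type }
    if count
      # Fixed-size array: repeat the element type `count` times.
      h['repeat'] = 'expr'
      h['repeat-expr'] = count.to_i
    end
    h
  end
end

members = parse_members("uint32_t magic; uint8_t __padding0[4];")
puts members.to_yaml
```

With this change `uint8_t __padding0[4];` is no longer skipped: it becomes a `u1` field repeated 4 times, while plain members are parsed as before.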
I agree that all projects should start with solid tests. Unfortunately, I'm not skilled enough to set up tests (especially not in Ruby), so I might not be very useful.
It's literally 30 to 50 lines of code. If you want to take the lead on this project and rewrite it in anything else (Python? JavaScript? Perl?), by all means feel free; it should be pretty straightforward :)
Nice work!
Q1: Would you be interested in pull requests to add conversion for more data types?
Q2: How would I go about adding support for array members? Example:
uint8_t __padding0[4];
The current version of this script seems to ignore all arrays.