I was recently asked how to check the entropy of a given section in YARA. Because the person who asked is clearly looking to learn how to fish rather than just be handed a fish, I went into some detail in the explanation. With his permission, I am sharing my response here.
It's a combination of a number of things:
math.in_range(test, lower, upper):
Given a test value, check to see if it is in range of the lower and upper bounds. This is an inclusive test.
math.entropy(offset, length):
Given an offset and length, calculate the entropy of those bytes (a value between 0 and 8 bits per byte).
pe.section_index(string):
Given a string, return the index into the sections array where that section lives.
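As a quick illustration of the two math functions on their own, here is a minimal sketch that checks the entropy of the whole file rather than a single section. The rule name and the 7.0 to 8.0 range are placeholders I picked for the example, not recommended thresholds.

```yara
import "math"

// Hypothetical rule name; the bounds are arbitrary example values.
rule whole_file_high_entropy {
    condition:
        // Entropy of every byte from offset 0 to the end of the file,
        // tested against an inclusive range.
        math.in_range(math.entropy(0, filesize), 7.0, 8.0)
}
```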
The basic goal:
- Use section_index to get the section object you want from the sections array.
- Use the raw_data_offset and raw_data_size values to calculate the entropy of that section.
- Use math.in_range() to check that the entropy is in a specific range.
And here's the rule:
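import "pe"
import "math"

rule text_entropy {
    condition:
        // True when the entropy of the .text section's raw data is between 4.0 and 5.0 (inclusive).
        math.in_range(math.entropy(pe.sections[pe.section_index(".text")].raw_data_offset, pe.sections[pe.section_index(".text")].raw_data_size), 4.0, 5.0)
}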
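If you are on YARA 4.0 or later, which (as far as I know) is where iterating directly over pe.sections became available, the same check can be sketched without repeating the section_index lookup. Treat this as an alternative under that version assumption, not as a replacement for the rule above. The rule name is hypothetical.

```yara
import "pe"
import "math"

// Hypothetical rule name; assumes YARA 4.0+ for "for ... in pe.sections".
rule text_entropy_iter {
    condition:
        for any section in pe.sections : (
            // Pick the section by name, then test the entropy of its raw data.
            section.name == ".text" and
            math.in_range(
                math.entropy(section.raw_data_offset, section.raw_data_size),
                4.0, 5.0
            )
        )
}
```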
It's worth noting that section_index can also take an integer, which is interpreted as the offset into the file (or RVA if scanning memory) and will return the index into the sections array where that offset lives. This allows you to do things like:
import "pe"

rule string_in_rdata {
    strings:
        $a = "FreeEnvironmentStringsW"
    condition:
        // @a[i] is 1-based, so iterate from 1 to the number of matches.
        for any i in (1..#a):
            (pe.section_index(@a[i]) == pe.section_index(".rdata"))
}
It is a slightly contrived example, but it illustrates the point. I think you can start to see the value in combining this with the previous rule to calculate the entropy of the section where a given string matches. You can also use the @a[N] construct to refer to the Nth place (counting from 1) where this string matched, if you know specifics like that.
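A rough sketch of that combination might look like the following. The rule name and the 7.0 threshold are made up for illustration; the idea is simply to feed the section index found from the match offset back into the entropy calculation.

```yara
import "pe"
import "math"

// Hypothetical rule name; the 7.0 threshold is an arbitrary example value.
rule string_in_high_entropy_section {
    strings:
        $a = "FreeEnvironmentStringsW"
    condition:
        for any i in (1..#a) : (
            // Look up the section containing this match, then measure
            // the entropy of that section's raw data.
            math.entropy(
                pe.sections[pe.section_index(@a[i])].raw_data_offset,
                pe.sections[pe.section_index(@a[i])].raw_data_size
            ) >= 7.0
        )
}
```

If a match does not fall inside any section, section_index returns UNDEFINED and that iteration simply evaluates to false.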
The one thing you currently can NOT do is ask "is this offset not in ANY section?". If section_index() cannot find the right section for the given argument, it returns UNDEFINED (which is a slight misnomer, as it is a defined, constant value used internally by YARA). This means that pe.section_index(0) returns UNDEFINED (some very large unsigned integer), and when that is used as the index into the sections array it effectively short-circuits the conditional to false.
Ideally that would be solved by saying:
is_undefined(pe.section_index(0))
I just need to write the is_undefined part, but I haven't had a need to do it yet. If you have a use case for it, let me know so I can get it in the next release.
-- WXS