@ncanceill
Last active December 18, 2015 06:59
Current implementation of random sampling in my fork of frag_find, based on code from bulk_extractor.
/* [...] */
int main(int argc,char **argv)
{
    /* [...] */
    //RANDOM SAMPLING START
    if(opt_sampling_params.size()>0) set_sampling_parameters(opt_sampling_params);
    /* Create a list of blocks to sample */
    srandom(time(NULL)); // random() is seeded by srandom(); srand() only portably seeds rand()
    std::set<uint64_t> blocks_to_sample;
    uint64_t nblocks = imagefile.blocks;
    int at_pass = 0;
    if (sampling()) {
        /* Each pass grows the set by another sampling_fraction of the image.
         * The inner loop can only terminate if sampling_fraction * sampling_passes <= 1,
         * because the set never holds more than nblocks distinct entries. */
        while(at_pass < sampling_passes) {
            at_pass++;
            while(blocks_to_sample.size() < nblocks * sampling_fraction * at_pass){
                uint64_t blk_high = ((uint64_t)random()) << 32;
                uint64_t blk_low = random();
                uint64_t blk = (blk_high | blk_low) % nblocks;
                blocks_to_sample.insert(blk); // std::set ignores duplicates; size grows only for new blocks
            }
        }
    }
    //RANDOM SAMPLING END
    for(uint64_t blocknumber=opt_start;blocknumber < opt_end && blocknumber < imagefile.blocks; blocknumber++){
        /* [...] */
        //RANDOM SAMPLING START
        /* Limit search to the random samples */
        if (sampling() && blocks_to_sample.find(blocknumber) == blocks_to_sample.end()) continue;
        //RANDOM SAMPLING END
        /* [...] */
    }
    /* [...] */
}

Random sampling for frag_find

This gist presents the current implementation of random sampling in my fork of frag_find, based on code from bulk_extractor.

Usage

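Random sampling is enabled with frag_find's option -R <frac>[:<pass>], where <frac> is the fraction of blocks to sample (default is 1) and <pass> is the number of passes (default is 1). Each pass adds a further <frac> of the image's blocks to the sample set.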
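
To illustrate the <frac>[:<pass>] syntax, here is a minimal sketch of how such an argument could be split. The body of set_sampling_parameters() is not shown in this gist, so the names SamplingParams and parse_sampling_params, as well as the range checks, are assumptions made for this example only.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical container for the two values carried by -R <frac>[:<pass>].
struct SamplingParams {
    double fraction = 1.0; // default taken from the usage text
    int    passes   = 1;   // default taken from the usage text
};

// Hypothetical parser; the real set_sampling_parameters() may behave differently.
SamplingParams parse_sampling_params(const std::string &arg)
{
    SamplingParams p;
    const std::string::size_type colon = arg.find(':');
    // substr(0, npos) yields the whole string when no ':' is present.
    p.fraction = std::stod(arg.substr(0, colon)); // throws on non-numeric input
    if (colon != std::string::npos) {
        p.passes = std::stoi(arg.substr(colon + 1));
    }
    // Assumed sanity checks: not taken from frag_find itself.
    if (p.fraction <= 0.0 || p.fraction > 1.0)
        throw std::invalid_argument("fraction must be in (0, 1]");
    if (p.passes < 1)
        throw std::invalid_argument("passes must be at least 1");
    return p;
}
```

Under these assumptions, "-R 0.25" would give a fraction of 0.25 with one pass, and "-R 0.5:3" a fraction of 0.5 with three passes.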

Integration

In frag_find.cpp, see commits 1 and 2.

Based on bulk_extractor.cpp.
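
For experimenting outside of frag_find, the block selection can be sketched as a standalone function. This is not the code from either project: choose_blocks and its seed parameter are hypothetical, the generator is swapped for std::mt19937_64 from <random> instead of random(), and the target is capped at the number of blocks so that the loop always terminates.

```cpp
#include <algorithm>
#include <cstdint>
#include <random>
#include <set>

// Hypothetical helper mirroring the selection loop; not part of frag_find.
std::set<uint64_t> choose_blocks(uint64_t nblocks, double fraction,
                                 int passes, uint64_t seed)
{
    std::set<uint64_t> blocks;
    if (nblocks == 0) return blocks; // nothing to sample
    std::mt19937_64 rng(seed);
    std::uniform_int_distribution<uint64_t> pick(0, nblocks - 1);
    for (int pass = 1; pass <= passes; ++pass) {
        // Cumulative target after this pass, capped at nblocks (an assumption
        // of this sketch) so the inner loop cannot spin forever.
        const double want = std::min(static_cast<double>(nblocks),
                                     nblocks * fraction * pass);
        while (blocks.size() < want) {
            blocks.insert(pick(rng)); // std::set silently drops duplicates
        }
    }
    return blocks;
}
```

With this sketch, a 1000-block image sampled with a fraction of 0.25 over 2 passes yields 500 distinct block numbers. Passing a fixed seed makes a run reproducible, which the time-based seeding above does not.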
