@forensicmike
Last active November 12, 2018 14:42
Compute the intersection of two hashsets
// The format of our checksum file is: b6f60e956ea5dc8b7056b35689b67efa *info.png
// The " *" separator sits between the two fields; the line anchors (^ and $) bound the match so the named groups capture the hash and the file name
Regex fileRegex = new Regex(@"^(?<hash>[\w\W]*?)\s\*(?<name>[\w\W]*?)$");
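// Load hashes from the first file.
// File, Regex and LINQ live in System.IO, System.Text.RegularExpressions and System.Linq,
// which LINQPad imports by default; add the matching using directives in Visual Studio.
var hashset1 = from line in File.ReadAllLines(@"C:\example\checksums.md5") // One entry per line, so ReadAllLines will do the trick here
               let match = fileRegex.Match(line) // Apply the regex to the line
               where match.Success // Skip blank or malformed lines, which would otherwise yield empty entries
               select new // Create an anonymously typed object containing the hash and file name
               {
                   Hash = match.Groups["hash"].Value,
                   FileName = match.Groups["name"].Value,
               };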
// Load hashes from the second file
var hashset2 = from line in File.ReadAllLines(@"C:\example\checksums2.md5")
               let match = fileRegex.Match(line)
               where match.Success // Skip blank or malformed lines here as well
               select new
               {
                   Hash = match.Groups["hash"].Value,
                   FileName = match.Groups["name"].Value,
               };
// Now we have two sequences of objects with Hash and FileName properties.
// Anonymous types compare by value, so LINQ's Intersect method returns the entries whose hash AND file name appear in both files.
hashset1.Intersect(hashset2).Dump();
// Dump() is a LINQPad extension method (see www.linqpad.net), but you could just as easily print the results with Console.WriteLine in VS
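// Outside LINQPad, a minimal stand-in for Dump() is to loop over the result and print each entry.
// This is only a sketch: the output format is an assumption that mirrors the checksum file layout.
foreach (var entry in hashset1.Intersect(hashset2))
{
    Console.WriteLine($"{entry.Hash} *{entry.FileName}");
}
// If only the hash should decide a match (same content under a different name), one option
// is to intersect the hash strings alone, ignoring the case of the hex digits.
// The name commonHashes is illustrative and not part of the original snippet.
var commonHashes = hashset1.Select(x => x.Hash)
    .Intersect(hashset2.Select(x => x.Hash), StringComparer.OrdinalIgnoreCase);
foreach (var hash in commonHashes)
{
    Console.WriteLine(hash);
}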