Last active
June 29, 2017 13:44
Use DAP to extract records from the HTTP GET study which contain a specific title
The following:
- decompresses the file into a pipe
- decodes the base64-encoded data in the 'data' field using the DAP 'transform' filter
- uses the DAP 'decode_http_reply' filter to decode the HTTP response
- emits this as JSON
- uses jq to check the 'data.http_title' field for "Invalid URL" (sorry for the dots in the name; they are why the key has to be quoted in the jq expression)
- uses pigz to compress the stream into an output file so that you don't have to store uncompressed data at any point

pigz -dc <filename> | \
dap json + transform data=base64decode + decode_http_reply data + json | \
jq -c '. | select(."data.http_title"=="Invalid URL")' | \
pigz -c > http_title_results.gz

The above could be optimized by:
- using parallel, since there is no requirement to maintain state between records
- using the DAP filters 'remove', 'flatten', etc. to remove data you don't need
- using grep before jq to only send records that contain 'data.http_title' to jq
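
A rough sketch of two of these optimizations, not a tested production pipeline. It assumes that "parallel" means GNU parallel (using its --pipe mode to split stdin on line boundaries) and that DAP's JSON output writes the key as the literal string "data.http_title". The function names prefilter_title and decode_parallel are hypothetical helpers introduced here for illustration, and the 10M block size is an arbitrary starting point.

```shell
# Hypothetical helper: pass through only the lines that mention the
# "data.http_title" key, so jq has fewer records to parse.
# grep -F does a fixed-string match, so the dots are taken literally.
prefilter_title() {
  grep -F '"data.http_title"'
}

# Hypothetical helper: run the DAP decode stage in several processes.
# Assumes GNU parallel; --pipe splits stdin into blocks of whole lines and
# feeds each block to its own dap process. Output order is not preserved,
# which is fine here because records are independent of each other.
decode_parallel() {
  parallel --pipe --block 10M \
    'dap json + transform data=base64decode + decode_http_reply data + json'
}

# Intended use (assumes pigz, dap, GNU parallel and jq are installed;
# <filename> is the same placeholder as in the original command):
#
# pigz -dc <filename> | \
# decode_parallel | \
# prefilter_title | \
# jq -c 'select(."data.http_title"=="Invalid URL")' | \
# pigz -c > http_title_results.gz
```

The grep stage only narrows the stream; jq still does the exact comparison, so a record that merely contains the key with a different title is dropped by jq as before.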

DAP GitHub link: https://github.com/rapid7/dap