The Hillary Clinton email archives being released by the US Department of State are an intriguing data set for analysis. They’re too large to easily analyze by hand, but still small enough that we can process them on a laptop. Here I will review some of the basic techniques for retrieving the emails and then performing some basic queries on the social networks within. Note that we’ll only be using a subset of the data here, as the entire processed corpus is too large for a gist to handle. For additional details and scripts, check out my GitHub: https://github.com/agussman/hrc-email.
External Resources
- linuxacademy.com (Training)
- Acloud.guru (Training)
- Read the top-level documentation and FAQs for all the major AWS resources (EC2, S3, RDS, Auto Scaling, etc.). The answers to the "nit-picky" questions can be found there. It's also helpful to go deeper on VPCs and networking-related concepts.
- Everyone says read the white papers. The ones I read were:
  - Security Best Practices https://d0.awsstatic.com/whitepapers/Security/AWS_Security_Best_Practices.pdf
  - Cloud Best Practices https://d0.awsstatic.com/whitepapers/AWS_Cloud_Best_Practices.pdf
Things to learn/review
- Mandatory Access Control (MAC) vs Discretionary Access Control (DAC)
- RADIUS / Diameter / TACACS
- TPM, HSM http://blogs.getcertifiedgetahead.com/tpm-hsm-hardware-encryption-devices/
- Signing vs Encrypting?
- Packet Headers
- SSL? Relationship to CA?
- Are CAs used for encryption? Or just verification?