Originally published in March 2012.
Given the modern security environment in which our networks and systems exist, organizations like incident response teams and security operations centers focus on the state of our own systems and networks. Outside of unusually large spikes, such as those from a Slammer-scale worm, the global threat level is relatively uninteresting in this context because it isn't actionable. We might be interested in regular summaries of those global data (e.g., daily or weekly briefings), but not for minute-to-minute dashboards. So the sorts of cruft that many organizations throw into their dashboards provide zero or even negative value: high-level threat intelligence or generic indicator feeds like those from ThreatExpert or DShield may have their uses, but not here. And the color-coded DHS-style "threat levels" provide even less value, because they tell us nothing about the risks in our own environments.
So what would we want to see on network security dashboards? Focus on the realities of modern network security, which means monitoring high-probability threats rather than what executives find easiest to understand or, worse, what a decade-old threat model suggests. We don’t care about port scans or negatively correlated IDS alerts (e.g., those for which we know the asset is not vulnerable). Firewall logs can provide some value, primarily as a proxy for netflow-type data. In general, if a perimeter security device blocks the attack, we don’t need to display it on a dashboard. The firewall or similar device did its job, and we care about detecting incidents rather than all attacks.
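As a rough illustration of that filtering logic, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `Alert` fields, the `VULNERABLE` inventory, and the `vuln-A`/`vuln-B` labels are hypothetical stand-ins for whatever your IDS and vulnerability scanner actually export.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    signature: str  # hypothetical signature name
    target: str     # asset the alert fired against
    vuln: str       # vulnerability the signature is meant to detect
    blocked: bool   # True if a perimeter device stopped the attack


# Hypothetical inventory: asset -> vulnerabilities it is known to have.
VULNERABLE = {
    "web01": {"vuln-A"},
    "db01": set(),
}


def dashboard_worthy(alert, vulnerable=VULNERABLE):
    """Keep alerts that may represent an incident, not merely an attack."""
    if alert.blocked:
        return False  # the perimeter device did its job
    known = vulnerable.get(alert.target)
    if known is None:
        return True   # no asset data, so we cannot rule the alert out
    return alert.vuln in known  # drop negatively correlated alerts


alerts = [
    Alert("sig-1", "web01", "vuln-A", blocked=False),  # vulnerable, not blocked
    Alert("sig-1", "db01", "vuln-A", blocked=False),   # asset not vulnerable
    Alert("sig-2", "web01", "vuln-A", blocked=True),   # stopped at the perimeter
    Alert("sig-3", "hr-pc", "vuln-B", blocked=False),  # asset not in inventory
]
shown = [a for a in alerts if dashboard_worthy(a)]
```

Treating unknown assets as worth showing is a design choice in this sketch; an environment with poor asset data might prefer to route those alerts to a lower-priority view instead.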
Dashboards need to present data that have two principal qualities: low false positive rates and immediate action available. The data should also allow the analyst to drill down to more detail for maximum utility, because data without context can create confusion and inefficiencies. In a general sense, we can divide up our dashboards into two types based on their appropriate level of visibility.
Primary dashboards need to be constantly visible and updated relatively frequently. These should include the types of data that would be on the large monitors visible to everyone in a SOC.
- Anomalous outbound connections or extrusion detection. Outbound flows can help identify compromised systems, once you whitelist normal connections like those from your proxy servers, mail relays, etc.
- Anomalous VPN logins, meaning those not whitelisted as coming from known-good addresses at normal times. As a first pass, you might show logins from foreign countries depending on your organization’s needs. This might be a candidate for machine learning to identify normal addresses and login times.
- Host sweep results, such as from a tool like GRR looking for indicators of compromise. Other host-based IDS systems may also provide value here, but they often have very high false-positive rates that limit their usefulness on dashboards.
- Network-based malware detection, such as FireEye or similar. Securosis has a particularly useful introduction to this tech.
- Bandwidth utilization, possibly plotted against normal or past data. Most network management teams already have a tool like MRTG. This helps you see DDoS attacks as they occur.
- Correlated IDS alerts, hopefully filtered through a SIEM so that you properly integrate asset and vulnerability data. I make a point of only listing high-confidence, high-severity alerts in my dashboards.
- WAF notifications help identify attacks using port 80 that firewalls will ignore and most regular network IDS do not handle very well. SQL injection is still a significant vector, and you ignore it at your own peril.
- Social media monitoring may deserve its own separate process, but it can play a role here as part of OSINT monitoring. Think in terms of watching pastebin and Twitter for interesting and relevant hits. For now, I’d consider this highly experimental in most organizations.
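To make the first item in the list above more concrete, the whitelist approach to anomalous outbound connections might look something like the following Python sketch. The internal range, the whitelist entries, and the flow tuple layout are all assumptions for illustration; real flow records would come from your netflow collector or firewall logs.

```python
import ipaddress

# Assumed internal address range for this example.
INTERNAL = ipaddress.ip_network("10.0.0.0/8")

# Hypothetical whitelist of (source host, destination port) pairs
# that are expected to make outbound connections.
ALLOWED = {
    ("10.0.0.10", 80),   # proxy server
    ("10.0.0.10", 443),  # proxy server
    ("10.0.0.25", 25),   # mail relay
}


def is_anomalous_outbound(src, dst, dport, internal=INTERNAL, allowed=ALLOWED):
    """Return True for internal-to-external flows that are not whitelisted."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    if src_ip not in internal or dst_ip in internal:
        return False  # not an outbound flow
    return (src, dport) not in allowed


flows = [
    ("10.0.0.10", "203.0.113.5", 443),   # proxy to internet, expected
    ("10.0.5.77", "203.0.113.9", 6667),  # workstation to internet, unexpected
    ("10.0.5.77", "10.0.0.10", 8080),    # internal only, ignored
]
suspicious = [f for f in flows if is_anomalous_outbound(*f)]
```

The same pattern, a known-good set plus a "not in the set" test, could serve as a first pass for the VPN login item, with source addresses and login hours in place of hosts and ports.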
Secondary dashboards should be available for rapid review so that they can immediately present useful info to an analyst. These include summaries and visualizations that an analyst will want to have immediately available but will not need immediately visible at all times. As an example, my secondary dashboards include:
- Session lists to show logged-in users, especially on VPN. Windows domain logins can also be useful, though take care with scaling issues.
- Recent traffic, though think carefully about what to exclude here. I don’t find it useful to have a dashboard showing TCP 80 traffic to web servers, TCP 25 on SMTP relays, or UDP 53 to DNS servers. I keep those logs available, but not displayed like this. Your environment will dictate the sorts of things you want to display here, but you can start by looking at the logic for anomalous outbound connections and removing some of the filters.
- IDS data also have value on a secondary dashboard, including uncorrelated or lower-priority alerts. You may identify attackers that use ineffective vectors before they find the ones that work.
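For the recent-traffic view, one way to express the exclusions described above is a small table of expected service traffic. This is only a sketch: the server addresses are placeholders, and the `EXPECTED` table and record layout are hypothetical.

```python
# Hypothetical map of (protocol, destination port) -> servers where
# that traffic is routine and not worth displaying.
EXPECTED = {
    ("tcp", 80): {"10.0.1.10"},  # web servers
    ("tcp", 25): {"10.0.1.25"},  # SMTP relays
    ("udp", 53): {"10.0.1.53"},  # DNS servers
}


def worth_displaying(proto, dst, dport, expected=EXPECTED):
    """Hide routine service traffic; show everything else."""
    return dst not in expected.get((proto, dport), set())


records = [
    ("tcp", "10.0.1.10", 80),  # web traffic to a web server, hidden
    ("tcp", "10.0.1.10", 22),  # SSH to a web server, shown
    ("udp", "10.0.1.53", 53),  # DNS to a DNS server, hidden
    ("tcp", "10.0.2.40", 80),  # web traffic to a non-web host, shown
]
recent = [r for r in records if worth_displaying(*r)]
```

The filtered records still belong in the logs; the function only decides what appears on the display.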
This should serve only as a starting point for thinking about your own near real-time displays and dashboards. Look for what makes sense for your organization and will enable you to detect incidents, rather than nominally "security-relevant" data that seems easier to understand at first glance.