Prioritizing fix order of minor flaws.
Abstract
This paper demonstrates how to score the importance of lower-impact flaws that can be chained together to allow
higher-impact vulnerabilities to be exploited in a single package. A common vocabulary and scoring system
will be established, and several of the current high-profile exploit chains used in pwn2win and the Chrome exploit
challenge will be explained and scored in this system to show where they lie.
1. Introduction
Software vendors analyse and score security flaws in isolation, without considering the existing unfixed flaws
in the system. Significantly complex systems with long life cycles can have multiple analysts dealing with flaws,
and those analysts cannot keep the full history of unfixed issues in mind when deciding which flaw to prioritize.
Using chains of exploits to attack a system is a common technique used by attackers. Chained exploits are complex and
far more difficult to defend against, and project developers lack the bird's-eye view of a system to have either
the data or the agency to act on it.
In an ideal world, every security flaw found would be fixed, but this is not a financially sustainable method of
fixing flaws. This document introduces a method of determining which flaws to fix based on how flaws can be chained
together to reach further.
The rest of this paper first discusses related work in Section 2, and then describes our implementation in Section 3.
Section 4 describes how we evaluated our system and presents the results. Section 5 presents our conclusions and
describes future work.
2. Related Work
Tenable has a product named "Predictive Prioritization". This tool attempts to rank flaws based on the
configuration of the host in the context of its use, which is particularly useful for customized reporting based on
system configuration.
Dependency Drift is a metric that classifies the risk of software dependencies aging, which deals with the complexity of having to update the parent package and ensuring the software still 'works' with the updated releases.
* Sophos, Comprehensive Exploit Prevention (white paper): https://www.sophos.com/en-us/medialibrary/Gated-Assets/white-papers/Sophos-Comprehensive-Exploit-Prevention-wpna.pdf
* Chained Exploits: Advanced Hacking Attacks: https://www.amazon.com/Chained-Exploits-Advanced-Hacking-Attacks/dp/032149881X
* A Cisco Press title on computer incident response teams also discusses the topic.
2.1 Other related work.
The CVSSv3 scoring system classifies an individual flaw for easier understanding. It is very useful
for understanding a flaw in a generic context, and it has greatly standardized the terminology and understanding of attack
mechanisms and the impact of an exploit.
CVSS has gone through many revisions in an attempt to represent flaws with greater clarity. It has achieved a
high quality of documentation around the scoring system and widespread industry adoption.
3. Implementation
Traditional analysts define the risk level of each flaw as
[risk level] = [value of resource] x [likelihood of exploit]
This per-flaw view is part of the problem we are trying to solve. We propose instead:
[flaw fix priority] = [likelihood of being used in chain] x [component weighting] x [risk level]
This currently only considers individual components, not libraries that are sub-components of other larger binaries. For example, glibc is a sub-component of almost every application. This idea is tabled for a later discussion.
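The two formulas above can be sketched as plain functions. This is a minimal illustration; the numeric inputs are invented for the example and do not come from the paper.

```python
def risk_level(resource_value: float, exploit_likelihood: float) -> float:
    """Traditional per-flaw risk: [value of resource] x [likelihood of exploit]."""
    return resource_value * exploit_likelihood

def fix_priority(chain_likelihood: float,
                 component_weight: float,
                 risk: float) -> float:
    """[flaw fix priority] = [likelihood of being used in chain]
    x [component weighting] x [risk level]."""
    return chain_likelihood * component_weight * risk

# Illustrative values only: a moderate flaw (risk 4.0) that is very likely
# (0.8) to appear in a chain inside a heavily weighted component (1.5)
# outranks a higher-risk flaw (6.0) that is rarely chained (0.1, weight 1.0).
print(round(fix_priority(0.8, 1.5, 4.0), 2))  # 4.8
print(round(fix_priority(0.1, 1.0, 6.0), 2))  # 0.6
```

The point of the sketch is that the chain likelihood and component weighting can invert the ordering produced by risk level alone.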
3.1 - Calculating the likelihood of being used in a chain
From an initial analysis of a sample of 9 writeups, the following terminology is used across both kernel
and userspace. Ranked below are the flaw types by share of mentions in the documentation.
* Misconfiguration ( 11% )
* Uncontrolled read ( 66% )
* Controlled Read ( 11% )
* Uncontrolled Write ( 33% )
* Controlled Write ( 22% )
Terminology such as 'buffer overflow' was grouped into 'uncontrolled write'. As a keen reader would expect,
multiple flaw types can be used in a single exploit chain.
At any given time there are any number of known and unknown flaws in a package. A dedicated attacker who finds a
higher-impact flaw will likely need to use the known, unfixed low-to-moderate rated flaws to defeat protection mechanisms or
to leak secrets used in the higher-impact exploit.
One problem that attackers face is that low-impact flaws are not always available through attacker-accessible
paths. For this reason, a greater number of low-impact flaws across a code base increases the probability that one of
them can be used as part of the exploit chain.
|========================|
| Intelligence Gathering |
|========================|
|
|========================|
| Defeating mitigation |<─┬───────┐
|========================| │ │
| │ │
|========================| │ │
| Exploitation |──┘ │
|========================| │
| │
|========================| │
| Privilege Permanence |──────────┘
|========================|
|
|========================|
| Clean up |
|========================|
The graph is not always traversed in a single direction, and the process can frequently require going back to previous steps. Because of this, the same minor-impact flaws can be repeatedly misused and their effective impact increased.
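The stages above can be modeled as a small directed graph. The back edges are why a single minor flaw can be misused at several points in a chain; the edge set below is an illustrative reading of the diagram, not data from the paper.

```python
# Attack stages from the diagram as a directed graph. Back edges into
# "defeating mitigation" come from exploitation and privilege permanence.
CHAIN = {
    "intelligence gathering": ["defeating mitigation"],
    "defeating mitigation": ["exploitation"],
    "exploitation": ["defeating mitigation", "privilege permanence"],
    "privilege permanence": ["defeating mitigation", "clean up"],
    "clean up": [],
}

# Count inbound edges per stage; stages reachable from many points in the
# chain are where a reusable low-impact flaw does the most damage.
inbound = {stage: 0 for stage in CHAIN}
for targets in CHAIN.values():
    for target in targets:
        inbound[target] += 1

print(inbound["defeating mitigation"])  # 3
```

In this reading, "defeating mitigation" has three inbound edges, so a minor flaw that defeats a mitigation can be replayed at three different points in the process.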
3.2 - Calculating the component weighting
The proposed formula for weighting is:
[weighting] = ([unfixed count] / [size] ) * [maturity rating]
Size.
The size of the executable code within a component should also be a consideration. Larger codebases with few flaws are
less likely to contain a usable 'gadget' for an attacker than a smaller codebase with the same number of flaws, because the flaw density is lower.
Maturity.
Project maturity should also be a consideration. A larger, mature codebase with few unfixed flaws reduces the
probability that one of those known flaws will be used as part of an exploit chain. Considering the inverse, a
smaller, newer codebase with a few unfixed flaws carries a larger chance that one of those flaws will be
used as part of a chain if the component is abused.
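The weighting formula and the two considerations above can be sketched directly. Measuring size in KLOC and giving newer projects a higher maturity rating are assumptions for illustration; the paper does not fix the units or the scale.

```python
def component_weighting(unfixed_count: int, code_size_kloc: float,
                        maturity_rating: float) -> float:
    """[weighting] = ([unfixed count] / [size]) * [maturity rating]

    Assumptions (not fixed by the paper): size is measured in KLOC of
    executable code; maturity_rating is higher for newer, less mature
    projects, so immaturity amplifies the weight.
    """
    return (unfixed_count / code_size_kloc) * maturity_rating

# Same unfixed-flaw count, opposite profiles: the small, young codebase
# is weighted far above the large, mature one, matching the argument above.
print(round(component_weighting(4, 50, 2.0), 3))   # small, young component
print(round(component_weighting(4, 500, 0.5), 3))  # large, mature component
```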
4. Evaluation
This theory has yet to be validated, but could be tested with existing data on any significant component with a number of flaws categorized correctly.
The resulting values will provide a measure of which component is most at risk of 'degredation' that should be fixed.
5. Conclusions and Future Work
At the moment there is no method of understanding which lows and moderates in which component is worth fixing. As at organization level we should work on improving security debt and without a measure to do so any reccomendations or comparisons are blind and guesswork.
We can further improve this by looking at cross-component states when binaries rely on libraries or services.
References
Dependency Drift - A Metric for Software Aging: https://nimbleindustries.io/2020/01/31/dependency-drift-a-metric-for-software-aging/