- I found vulnerabilities in 1 browser, 1 programming language, and 15 {Web Framework, HTTP Client library, Email library / Web Service, etc.}.
- All the vulnerabilities were found from a single perspective (I investigated roughly 50-80 products).
- The RFCs describe the requirements for this problem rather confusingly, while the WHATWG HTML Spec documents them well.
- The problem clearly targets the Content-Disposition fields filename and filename*.
- This problem affects HTTP Request/Response/Email in different ways.
  - HTTP Request: request tampering (especially of file contents, tainting of other fields, etc.)
  - HTTP Response: Reflected File Download vulnerability
  - Email: attachment tampering (e.g., extension and filename tampering and potential file content tampering)
- Not many people currently recognize Content-Disposition (filename, filename*) as an obvious attack vector.
- I haven't seen a single OWASP publication that summarizes this area properly. ASVS has an Issue on this.
- Make sure to escape filename and filename* in Content-Disposition.
  - filename:
    - " --> \" or %22
    - \r --> \\r or %0D
    - \n --> \\n or %0A
  - filename*:
    - URL Encode with proper formatting
In this article, I will write about the vulnerabilities I found in research conducted between 2018 and 2022. That span looks very long, but it simply reflects ups and downs in motivation and long stretches in which I did not investigate at all. The actual research time is about three to six months.
Research revealed vulnerabilities in the following products
- 1 browser
- 1 programming language
- 15 {Web Framework / Library / Web Service} (The wording is deliberately vague because it includes some items that have not been fixed yet.)
The problems we found fall into three main categories:
- Content tampering vulnerability during payload generation due to insufficient escaping of filename in multipart/form-data > Content-Disposition in HTTP Request
- Reflected File Download vulnerability due to insufficient escaping of filename and filename* in the Content-Disposition header in HTTP Response
- Content tampering vulnerability due to insufficient escaping of filename in the multipart > Content-Disposition of Email
The root cause is the same in all three categories; only the location where it occurs differs.
When we first discovered traces of the problem, we judged it to be an implementation error on the part of the Web Framework that did not comply with the RFC. However, after reporting the problem to several Web Frameworks, we received the comment, "It must be an implementation problem on the browser side." We reported the vulnerability to the browser (Firefox), and it was successfully fixed (after two years of inactivity).
This seemed to be the end of the project, but an article by one person led me to discover a new perspective, and I was motivated to do additional research after several years.
The vulnerabilities whose contents have already been disclosed are as follows
- Firefox (No-CVE, https://bugzilla.mozilla.org/show_bug.cgi?id=1556711 )
- Python (No-CVE, python/cpython#100612 )
- apache/httpcomponents-client (Java, CVE None, Fixed)
- Sinatra (Ruby, CVE-2022-45442, Fixed)
- Ktor (Kotlin, CVE-2022-38179, Fixed)
- Django (Python, CVE-2022-36359, Fixed)
- iris (Golang, CVE None, Fixed)
- httparty (Ruby, Github Advisory (GHSA-5pq7-52mg-hr42), Fixed)
- Other (Not yet fixed)
In addition, this issue itself is just the tip of the iceberg, and many blog posts do not mention that the implementation pattern this article deals with is vulnerable. Even the RFC is somewhat vague, and advice on countermeasures is not well known to the general public. This suggests that few people (perhaps only the maintainers of well-known frameworks and some security researchers) are aware of this issue.
In fact, a discussion on this issue is currently underway in OWASP ASVS v5.
This article will summarize the results of these series of studies.
Some parts of the record can no longer be traced because quite a bit of time has passed. For this reason, I have marked the uncertain sections with (vague).
I wondered whether I should wait for all the problems I found to be fixed before publishing this article, but I decided to publish it for the following reasons.
- I made every effort I could think of, but in some cases I was never contacted back. I sent a patch, unit tests, a detailed report, reference material for publishing a Security Advisory, etc., but either never heard back or the changes were not merged.
- Vulnerability was found in a library that is not maintained (we contacted them, but they did not reply).
- Since there seems to be no public awareness of this issue of Content-Disposition in the first place, we decided that it would be better to make the issue known as soon as possible.
In fact, 99% of the implementations of Content-Disposition introduced in (Japanese) blogs etc. have no warning about this issue. Of course, there are cases where the Framework/Library implicitly performs escaping and the like. In other cases, they do nothing.
Therefore, we will provide a detailed report on the root cause of this problem. Of course, for libraries that have not been fixed, the article will not mention any names or problem areas.
Well, it was a long time ago that I found signs of a vulnerability. At the time, I was looking for vulnerabilities in our web application (made with Spring Framework) and was testing it with simple fuzzing files.
I was testing file uploads, etc. using files named like the following
example";';.jpg
When testing file upload and other functions using such data, we found that some functions returned a 500 Error.
I was excited that there might be some kind of vulnerability that would allow an Injection, so I analyzed the behavior, but the result was disappointing.
After creating minimal proof-of-concept data, it seemed that the error occurred when "; was included.
To be sure, I checked the error log on the server side to see if it was "just a bug" or if my Injection skills were insufficient.
I found the following error logs
IllegalArgumentException("Invalid content disposition format");
Apparently, a formatting error has occurred in Content-Disposition (see below) when uploading a file.
Content-Disposition is a header used in requests, responses, etc.
In a request it is used inside multipart/form-data and carries information such as the file name.
So it turned out not to be a basic problem such as Injection, as I had originally envisioned. That's unfortunate.
However, this is not a problem that can be fully understood just by looking at it; it is one that could lead to a major vulnerability if investigated further!
So let's investigate.
Before we begin, let us briefly cover Content-Disposition and multipart/form-data, which are relevant to this article.
multipart/form-data is a MIME type.
It is one of the encoding types for forms (<form>) available in HTML.
The encoding types that can be sent with <form> are as follows.
- application/x-www-form-urlencoded
- multipart/form-data
- text/plain
Of these, only multipart/form-data can send binary data (when using <form>).
For example, the following form sends binary data as multipart/form-data. Note that enctype must be set explicitly (together with method="post"); without it the browser falls back to application/x-www-form-urlencoded and sends only the file name.
<form action="file_upload" method="post" enctype="multipart/form-data">
<input type="text" name="name">
<input type="file" name="avatar">
<input type="submit">
</form>
A multipart/form-data request has the following format.
Note: these are excerpts from the HTTP Header / Body. Abbreviations are marked as ... .
HTTP Header
Content-Type: multipart/form-data; boundary=----RandomValue123
HTTP Body
------RandomValue123
Content-Disposition: form-data; name="name"

Taro
------RandomValue123
Content-Disposition: form-data; name="profile"; filename="profile.txt"
Content-Type: text/plain

Hello
World!
------RandomValue123
Content-Disposition: form-data; name="avatar"; filename="avatar.png"
Content-Type: image/png

{Binary Data}
------RandomValue123--
As shown above, the Content-Type of multipart/form-data is specified in the Header section.
In addition, a value like boundary=----RandomValue123 is set there as the delimiter that separates each parameter in the Body section.
In the Body, each delimiter line is -- followed by the boundary value (------RandomValue123). Only the last one has an additional -- appended (------RandomValue123--) to indicate the end of the parameters.
Each separated parameter is called a part.
In a part, data may be described using Content-Disposition, which is the subject of this article.
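To make the layout above concrete, here is a minimal sketch in Python that assembles such a body by hand. The function name build_multipart_body and the sample values are my own for illustration and are not taken from any library; the sketch performs no escaping at all and only shows where the boundary, the part headers, and the part content go.

```python
CRLF = b"\r\n"


def build_multipart_body(parts: list[tuple[str, bytes]], boundary: str) -> bytes:
    """Join (header_block, content) pairs into a multipart/form-data body.

    Hypothetical helper for illustration only: no escaping is performed.
    """
    delimiter = b"--" + boundary.encode("ascii")
    chunks = []
    for header_block, content in parts:
        chunks.append(delimiter + CRLF)                     # start of a part
        chunks.append(header_block.encode("utf-8") + CRLF)  # part headers
        chunks.append(CRLF)                                 # blank line ends the headers
        chunks.append(content + CRLF)                       # part content
    chunks.append(delimiter + b"--" + CRLF)                 # closing delimiter
    return b"".join(chunks)


boundary = "----RandomValue123"
body = build_multipart_body(
    [
        ('Content-Disposition: form-data; name="name"', b"Taro"),
        (
            'Content-Disposition: form-data; name="profile"; filename="profile.txt"\r\n'
            "Content-Type: text/plain",
            b"Hello\nWorld!",
        ),
    ],
    boundary,
)
print(body.decode("utf-8"))
```

The request header would carry the same value as Content-Type: multipart/form-data; boundary=----RandomValue123.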
I took a detour, but now let's talk about Content-Disposition, the subject of this article.
To be honest, I feel that reading MDN is the most accurate and easiest way to understand the subject, so I will post the URL.
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Disposition
Content-Disposition is a header used in HTTP Response / Request (inside multipart/form-data) / Email.
In the case of Response, it controls what the browser does with the response body, in combination with its MIME type.
For example:
- If the MIME type is image/png and Content-Disposition: inline, the file is displayed in the browser.
- If the MIME type is image/png and Content-Disposition: attachment, the file is downloaded.
- If the MIME type is image/png and Content-Disposition: attachment; filename="abc.jpg", the file is downloaded as "abc.jpg".
- If the MIME type is image/png and Content-Disposition: attachment; filename*=utf-8''{URL-encoded filename}.jpg, the URL-encoded value is decoded and the file is downloaded under the decoded name (non-ASCII characters are also supported).
Thus, Content-Disposition can control how the file is handled.
Also, the filename* field appears only in the HTTP Response case.
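As a rough illustration of the response-side forms listed above, the following Python sketch builds an attachment header carrying both filename and filename*. The function name attachment_header is hypothetical, and the choice to percent-encode ", CR and LF in the plain filename is just one of the escaping options mentioned in the summary at the top, not the behavior of any particular framework.

```python
from urllib.parse import quote


def attachment_header(filename: str) -> str:
    """Build a Content-Disposition response header (illustrative sketch)."""
    # ASCII-only fallback for the plain filename parameter.
    ascii_name = filename.encode("ascii", "replace").decode("ascii")
    # Neutralise the characters that could end the quoted value or the header line.
    for raw, escaped in (('"', "%22"), ("\r", "%0D"), ("\n", "%0A")):
        ascii_name = ascii_name.replace(raw, escaped)
    # filename* carries the percent-encoded UTF-8 form of the name.
    encoded = quote(filename, safe="")
    return (
        f'Content-Disposition: attachment; filename="{ascii_name}"; '
        f"filename*=UTF-8''{encoded}"
    )


print(attachment_header("abc.jpg"))
print(attachment_header('evil";\r\n.sh'))
```

With this approach a file name containing " or a line break can no longer terminate the quoted value or inject a new header line.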
On the other hand, inside the multipart/form-data of a Request, it is used to express information about the target parameter (the field name or filename of the form).
For example, Content-Disposition: form-data; name="picture"; filename="filename.jpg" means the following.
- The parameter name is picture (equivalent to the name attribute of the form field).
- The file name is filename.jpg.
The form-data part at the beginning of Content-Disposition: form-data; is fixed boilerplate.
The first directive in multipart/form-data at request time is always this value.
As a header for a multipart body ... The first directive is always form-data
For your information, the fields defined for Content-Disposition in multipart/form-data include the following.
- disposition
- disposition-type
- disposition-param
Within these fields, the following parameters are further defined.
- disposition
  - disposition-type (e.g. "form-data")
- disposition-type
  - "inline"
  - "attachment"
  - extension-token
- disposition-param
  - filename
  - creation-date
  - modification-date
  - read-date
  - size
  - parameter
The above fields are defined in RFC 2183 [page 2].
https://datatracker.ietf.org/doc/html/rfc2183
Now the groundwork is finally in place. Let's look at why the parsing error in question occurred.
At the beginning of this article, I mentioned that I sent the file example";';.png to the upload function, but I did not show the payload of the HTTP Request sent at that time, so let's take a look.
The following is the request as sent by Firefox at that time. (*Firefox can no longer send this kind of request, as the issue has since been fixed.)
sent file:
example";';.jpg
HTTP Header of the multipart/form-data request sent:
POST /form HTTP/1
...
Content-Type: multipart/form-data; boundary=---------------------------277224600214918072883416139191
HTTP Body of multipart/form-data request sent:
-----------------------------277224600214918072883416139191
Content-Disposition: form-data; name="file"; filename="example";';.png"
Content-Type: image/png
{Binary Data}
-----------------------------277224600214918072883416139191--
Server-side errors encountered (Spring)
https://github.com/spring-projects/spring-framework/blob/4f05da7fed7e55d0744a91e4ac384d8f5df6e665/spring-web/src/main/java/org/springframework/http/ContentDisposition.java#L316
As you may have already noticed, " is not escaped in the filename of the request HTTP Body.
part of HTTP Header:
Content-Disposition: form-data; name="file"; filename="example";';.png"
Problem Location:
filename="example";';.png"
                 ^
                 | Here
Perhaps this " is causing the parameters to be parsed incorrectly on the server-side application.
The " delimits the parameter value, so leaving it unescaped may fool the parser.
In other words, if (in some attack scenario) a crafted filename can be sent, it may be possible to tamper with or spoof parameters or fields and adversely affect a victim.
If this is the case, we need to look at the parser code in order to know how to create a payload that can trick the parser and what fields can be exploited. First, let's look at the function part of the parser that was generating the Exception.
Note: In the end, it turns out that this parser cannot be exploited in an attack, but it is relevant to what we discuss later, so it is included here for those who want the details. You can skip this section if you wish.
The parse() that raised the Exception is the parser for the fields defined in RFC 2183.
Parse a {@literal Content-Disposition} header value as defined in RFC 2183.
https://github.com/spring-projects/spring-framework/blob/4f05da7fed7e55d0744a91e4ac384d8f5df6e665/spring-web/src/main/java/org/springframework/http/ContentDisposition.java#L252
This parse() returns a ContentDisposition object as the parsed result.
return new ContentDisposition(type, name, filename, charset, size, creationDate, modificationDate, readDate);
- type
- name
- filename
- charset
- size
- creationDate
- modificationDate
- readDate
So, the fields that can be used to fool the parser are limited to these. Let's look at the detailed implementation.
First, parse() takes a String holding a Content-Disposition header and decomposes it into a List<String> called parts with the tokenize() function.
public static ContentDisposition parse(String contentDisposition) {
List<String> parts = tokenize(contentDisposition);
The tokenize() function looks at the delimiters [ ; " \ ] and splits the header into fields while maintaining a flag for each delimiter ("Is it currently inside a quote?", "Is the current char an escape character?").
If you give this tokenize() the Content-Disposition header string from earlier (where the browser sends the wrong data), it is broken down into the following set of parts.
Input:
Content-Disposition: form-data; name="file"; filename="example";';.png"
Output:
(ArrayList String)
[0] = Content-Disposition: form-data
[1] = name="file"
[2] = filename="example"
[3] = '
[4] = .png"
The Content-Disposition given as Input contains a " that is not escaped with \, so the parsing result is distorted, as you can see.
Next, let's look at the result when it is properly escaped (" converted to \").
(ArrayList String)
[0] = Content-Disposition: form-data
[1] = name="file"
[2] = filename="example\";';.png"
If the escaping is done properly, we see that the result is consistent with our intuition.
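The splitting behavior described above can be sketched in a few lines of Python. This is my own simplified model of a quote-aware tokenizer, written to reproduce the two results shown above; it is not the Spring Framework source, and for brevity the input here omits the leading "Content-Disposition:" prefix.

```python
def tokenize(header_value: str) -> list[str]:
    """Split a header value on ';' unless the ';' is inside a quoted string."""
    parts = []
    current = []
    quoted = False   # "Is it currently inside a quote?"
    escaped = False  # "Was the previous char an unescaped backslash?"
    for ch in header_value:
        if ch == ";" and not quoted:
            parts.append("".join(current).strip())
            current = []
            escaped = False
            continue
        if ch == '"' and not escaped:
            quoted = not quoted
        escaped = ch == "\\" and not escaped
        current.append(ch)
    parts.append("".join(current).strip())
    return [p for p in parts if p]


broken = 'form-data; name="file"; filename="example";\';.png"'
fixed = 'form-data; name="file"; filename="example\\";\';.png"'
print(tokenize(broken))
print(tokenize(fixed))
```

The unescaped " ends the quoted value early, so the following ; is treated as a field separator; with \" the quote state is preserved and filename stays in one piece.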
Now, the next question is how far we can go with this broken format (without generating errors) to fake values in predefined fields and the like. Currently, we have an Exception, so we need to create something that is not problematic from the parser's point of view (although it is in fact broken).
In parse(), the following fields are then extracted as (key, value) pairs from the result of tokenize().
- type
- name
- filename
- charset
- size
- creationDate
- modificationDate
- readDate
The detailed processing of parse() is as follows.
(The leading fixed part (Content-Disposition: form-data) is skipped.)
- Decompose each part into an attribute/value pair by finding = (an Exception is raised if there is no =). Since a pair can take two forms (key=value, key="value"), the extraction takes both into account.
- If the attribute is name, store the value in the name variable.
- If the attribute is filename*, store the value in the filename variable, taking care of ' (its format is special, unlike the others).
- If the attribute is filename, store the value in the filename variable.
- If the attribute is size, store the value in the size variable.
- If the attribute is creation-date, store the value in the creationDate variable.
- If the attribute is modification-date, store the value in the modificationDate variable.
- If the attribute is read-date, store the value in the readDate variable.
Now I finally understand why the Exception was happening.
In the broken Content-Disposition described earlier, tokenize() created parts that did not contain a single =.
Output:
(ArrayList String)
[0] = Content-Disposition: form-data
[1] = name="file"
[2] = filename="example"
[3] = '
[4] = .png"
So it was caught by the check inside parse() that raises an Exception if a part has no =.
In other words, the file name needs to do the following.
- Produce only parts that contain =, by making full use of " and so on in the file name.
- Use the key=value (or key="value") format.
- Set the attribute to name, filename, creation-date, etc.
If we can create a file name that satisfies all of the above, it could (possibly) be used for some kind of attack. So I created the following file name.
file name:
a.txt"; dummy=".txt
Content-Disposition generated by Firefox
Content-Disposition: form-data; name="file"; filename="a.txt"; dummy=".txt"
The parts of the result parsed by tokenize():
[0] = Content-Disposition: form-data
[1] = name="file"
[2] = filename="a.txt"
[3] = dummy=".txt"
Great, it worked.
At least I was able to fool tokenize() without any Exception.
Now let's add several more fields alongside the generated dummy and see the final parse() result.
file name:
a.txt"; name="dummy"; filename="dummy"; size=1234; dummy=".txt
Content-Disposition generated by Firefox:
Content-Disposition: form-data; name="file"; filename="a.txt"; name="dummy"; filename="dummy"; size=1234; dummy=".txt"
parse() result:
type = "Content-Disposition: form-data"
name = "dummy"
filename = "a.txt"
charset = null ( Value set for filename*.)
size = {Long@438} 1234
creationDate = null
modificationDate = null
readDate = null
It works!
I was able to override the value of the form's name attribute from file to dummy, while the file name became a.txt.
I was also able to set the file size to 1234, and the dummy field, which is not defined in the RFC, was ignored without causing any problems, so it is perfect.
Now all we have to do is to look at how the fields we were able to trick are used inside the Spring Framework and come up with a scenario. But after looking at the internal code, we found that most of the fields are not used inside the Spring Framework.
The only two fields that were used internally were
- fileName
- name
Still, not giving up, we also followed up on the use of these fields, but could not find any clues to an attack that could be completed in a single request.
Having no choice, we sent a report of the vulnerability so far to Spring Framework, while also looking horizontally for similar problems in other frameworks. We also found reproductions of similar issues in Ruby on Rails and Ktor and sent them in. (At this point we were unaware that it was a Firefox issue...)
As a result, as I wrote at the beginning of the report, the response was "this is not a problem on the Framework side, but on the browser side (Firefox)".
It is obvious now, but this point of view is lost when you are verifying a problem after it has been discovered.
In fact, when we tried to reproduce the problem in each browser, the problem was reproduced only in Firefox. So, we changed direction and considered whether the problem could be reported as a Firefox vulnerability.
However, there is a problem that the attack scenario is too weak to be communicated to browsers as a "vulnerability" at this stage. This is because, at this stage, we have only constructed scenarios such as "potential problem" and "parameters can be overwritten".
Therefore, we devised the following clearer scenario with the materials we have now.
1. The attacker sends a crafted value for a parameter that is later used as "filename" (in a subsequent request).
2. There exists a page where the value sent by the attacker in (1) is reflected in a form field.
3. The victim visits the form in (2) with Firefox.
4. The victim submits the form.
5. The victim's Firefox sends a "broken Content-Disposition".
6. A parser such as Spring's parses the broken Content-Disposition incorrectly.
7. As a result of (6), the form field name `name` is overwritten and the paired value is sent to a different field than the original one.
In other words, it is a scenario in which a value planted by the attacker can contaminate the contents of another field in the victim's form.
One specific scenario would be as follows. (This is a slightly unreasonable scenario...)
1. The attacker sends a value that becomes the filename of the form to be submitted by the victim (crafted to overwrite the form field age).
2. The victim submits the form on the profile page where the data from (1) is reflected.
   - The age field exists in the submitted form.
   - The content of the file is a number such as 1000.
3. The victim sends a multipart/form-data request containing a normal age field and a field with a corrupted Content-Disposition that carries multiple names (avatar on one side, age on the other).
4. The server side gives priority to (overwrites with) the later field when the same field appears more than once.
5. The server stores the user's age as 1000 (integrity impact).
In the above scenario, the impact may seem very small. However, the impact depends greatly on "what kind of system" is targeted and "which parameters can be affected", so these cases cannot all be rated as "low risk".
For example, if there is a parameter such as profile_html with a Self XSS or similar issue, and attacker data can be put there, the impact will be high.
Nevertheless, this problem only occurs on fairly rare pages: ones with a multipart/form-data form in which the attacker's data is reflected in the victim's fields.
Therefore, we believe that the bar for the attack is high and the risk is low.
I filed an Issue with Firefox containing the above information. As a result, it was classified as a Security Issue with priority P3. I was happy about that.
https://bugzilla.mozilla.org/show_bug.cgi?id=1556711
...and a lot happened from there.
First of all, I did not hear from them at all for about two years after I reported it. The conditions under which the problem occurs are so strict that this is understandable. (At the time, Mozilla was undergoing a major restructuring, so there may have been a lack of internal resources.)
Then one day, two years after I reported it, I wondered, "Maybe it has been fixed?" I tested it and, sure enough, it had been fixed silently. I then asked, "Will a CVE be issued?" but was ignored.
Now, as a side note, there was one part of the post-report exchange that I personally found interesting.
As noted in the Bugzilla exchange, the WHATWG spec lists requirements for escaping. https://html.spec.whatwg.org/#multipart-form-data
For field names and filenames for file fields, the result of the encoding in the previous bullet point must be escaped by replacing any 0x0A (LF) bytes with the byte sequence %0A, 0x0D (CR) with %0D and 0x22 (") with %22. The user agent must not perform any other escapes.
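The three replacements in the quoted requirement are small enough to sketch directly. The Python function below is my own illustration (the name escape_multipart_value is hypothetical) and works on str rather than on the encoded bytes the spec talks about, so treat it as a simplification rather than a conforming implementation.

```python
def escape_multipart_value(value: str) -> str:
    """Apply the three replacements quoted above: LF, CR and the double quote."""
    return (
        value.replace("\n", "%0A")
        .replace("\r", "%0D")
        .replace('"', "%22")
    )


filename = 'a.txt"; dummy=".txt'
header = (
    'Content-Disposition: form-data; name="file"; '
    f'filename="{escape_multipart_value(filename)}"'
)
print(header)
```

After escaping, the crafted file name from earlier stays inside a single quoted filename value and can no longer introduce extra fields.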
And the reason for the (silent) fix was that the developer in charge had changed the code to conform to the HTML specification.
This was fixed in bug 1686765, as part of changing the multipart/form-data encoding to follow the HTML specification and to be compatible with Chrome and Safari. When I was working on that bug, and on the specification change that enshrined Chrome/Safari's behavior, I did not know that this was an open security bug, nor did I realize that the fact that double quotes were not escaped in Firefox's previous implementation could be exploited. But this issue has now been fixed since Firefox 90.
For something as large as a browser, it is not surprising to hear "I changed the code to follow the agreed specification, and this problem got fixed along the way" or "I didn't know that the part I fixed was a vulnerability". I also realized once again that (although it is obvious) specifications contain requirements that exist to prevent potential problems, and that ignoring these requirements can lead to vulnerabilities.
I was a little disappointed that Firefox didn't seem to be issuing CVEs, but then something happened that made me want to look further into Content-Disposition. It all started with a CVE. (I believe it was the following CVE. I'm a bit fuzzy on this one, as there are several similar CVEs out there).
https://nvd.nist.gov/vuln/detail/CVE-2020-5398
In Spring Framework, versions 5.2.x prior to 5.2.3, versions 5.1.x prior to 5.1.13, and versions 5.0.x prior to 5.0.16, an application is vulnerable to a reflected file download (RFD) attack when it sets a “Content-Disposition” header in the response where the filename attribute is derived from user supplied input.
At the time this CVE was published, I was reading through most newly issued CVEs, and was surprised to see Spring report a new CVE related to "Content-Disposition".
I was surprised because I had reported this issue to Spring as a vulnerability in its Content-Disposition parser, back when I mistakenly believed the initial issue was a problem in the Web Framework.
At the time, I had no idea that this problem might lead to an RFD (Reflected File Download) vulnerability.
I could read the CVE as "this could be a vulnerability based on the Content-Disposition problem I reported!" Since I knew RFD at least by name from OWASP or other documents, I regretted my lack of knowledge and persistence in such situations.
One day, some time later, my motivation returned and I decided to look for a similar problem.
Before proceeding with the investigation, I once again considered the question, "Where can we attack?"
First, as far as I could tell, Content-Disposition is used in the following three places.
- multipart in HTTP Body of HTTP Request
- HTTP Header of HTTP Response
- multipart in the HTTP Body of an Email
After considering these cases, we found some that could be used for attacks and some that are unlikely to be.
The following can be used in an attack:
- Tainting of other fields, tampering with file contents, etc. by using filename when sending an HTTP Request
- Changing file extensions by using filename, filename* when receiving an HTTP Response (Reflected File Download vulnerability)
- Contamination of other fields or tampering with file contents by using filename when sending an email
On the other hand, we did not investigate the following case because we judged that it could not be used for attacks.
- Problems when parsing mail (mbox, etc.) that contains crafted Content-Disposition headers
In this case, the problem occurs when a mail receiver such as a Mail Client parses a crafted Content-Disposition, and even if integrity is violated at that point, it is not the mail receiver's responsibility. If there is an impact on availability, etc., it may be worth addressing, but it is outside the scope of this article's investigation.
The investigation proceeded as follows.
- Determine the repository to be investigated.
- Search the repository for `disposition`.
- If the escaping seems insufficient in the code that generates Content-Disposition, perform additional investigation.
- Look for code samples that exercise the code generating Content-Disposition.
  - In case of a Web Framework, check with "File Download".
  - For an HTTP Client, check with `multipart`.
  - In case of an Email library, use "Attach File" or similar.
- Build an application based on the samples and test several cases.
The search step, for example, looks as follows.
e.g. Spring https://github.com/search?q=org%3Aspring-projects+Disposition&type=code
As a result, we found vulnerabilities in 17 products (including those that are still undisclosed), as described at the beginning of this report. I will pick out the most distinctive ones and provide details on these.
From here, we will discuss the vulnerabilities that were unique among the vulnerabilities we found while analyzing the problem for each category.
This is the Firefox problem described in the first section.
When a value entered by someone else ends up in HTTP Request > Content-Disposition > filename on the victim's device (where the vulnerability exists), the content is tampered with or fields are corrupted.
This "victim device" includes:
- Vulnerable browsers running on the victim's device
- An HTTP Client embedded in a Web App or other functionality
Possible ways to exploit this problem include:
- If " can be inserted, the attacker can force the end of the filename value and add arbitrary fields.
- If \r, \n can be inserted, CRLF can be added in multipart/form-data.
  - A new line can be added in the header part of a multipart/form-data part (e.g., another Content-Disposition can be appended).
  - The header part of a multipart/form-data part can be terminated to move into the body part (content can be inserted in the head / body).
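The CRLF case can be demonstrated with a deliberately naive part builder in Python. Both naive_part and the crafted file name are hypothetical and exist only for this illustration; the point is simply what happens when a client interpolates filename without any escaping.

```python
def naive_part(name: str, filename: str, content: str) -> str:
    """Hypothetical vulnerable builder: values are inserted without escaping."""
    return (
        f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
        "Content-Type: text/plain\r\n"
        "\r\n"
        f"{content}"
    )


# The quote closes filename early; CRLF CRLF then ends the header block.
crafted_filename = 'a.txt"\r\n\r\ninjected'
part = naive_part("file", crafted_filename, "original content")

# A multipart parser treats the first blank line as the header/body boundary.
headers, body = part.split("\r\n\r\n", 1)
print(repr(headers))
print(repr(body))
```

From the parser's point of view the part now has a clean-looking header block, and everything after the injected blank line, including the real Content-Type line and the original content, is treated as body data that starts with attacker-controlled text.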
The impact of this problem will vary slightly depending on where it occurs (in the Client, such as a browser, or in a service, such as a Web App).
For example, in the case of a browser, the victim is always the person operating the browser, and the attacker is always an external party. In other words, the attacker tries to make the victim send a "broken Content-Disposition". I don't think forms are usually designed in a way that allows this, so the likelihood is not high.
On the other hand, if the problem occurs in an HTTP Client inside a Web App, the attacker may be able to have the broken Content-Disposition sent directly (more precisely, sent by the victim's HTTP Client). I feel that this one is still "possible"; now that features such as Web Hooks have become commonplace, you can probably find it if you look for it.
Another way to exploit this is to bypass a validator. For example, if the Web Application has something like a form validator, its file extension checks can be bypassed.
It is also possible to tamper with internal hidden parameters by overwriting them, which may lead to an impactful attack depending on how they are used.
This technique is better suited to Penetration Tests and CTFs, but its value depends on how it is used.
httparty is Ruby's HTTP Client.
https://github.com/jnunemaker/httparty/
This case was exactly the same problem as Firefox. For this reason, we will not go into details.
Please refer to the report we sent to the maintainer here, as appropriate.
https://github.com/jnunemaker/httparty/security/advisories/GHSA-5pq7-52mg-hr42
apache/httpcomponents-client is a Java HTTP Client. https://github.com/apache/httpcomponents-client
The httpcomponents-client issue had the same cause as the Firefox and httparty issues.
However, there is one difference: it was a regression from v4 to v5.
- v4 had the vulnerability addressed.
- v5 became vulnerable to this issue again when major changes were made (including changes to the policies of the APIs provided).
In v4, the fix is in place as shown in this commit. https://github.com/apache/httpcomponents-client/commit/6d583c7d8cc41a188a190218a6489541b79cf35a
HTTPCLIENT-1859: Encode header name, filename appropriately
The original motivation for this fix is as follows.
https://www.mail-archive.com/[email protected]/msg18531.html
The ContentDisposition header, used in multi-part forms, has a name and filename subfield; these need to be escaped using unix-standard backslash character stuffing, but FormBodyPartBuilder does not currently do this. It should.
However, in v5, the library (supposedly) provides only the most core HTTP functionality and does not handle lower(?)-level concerns such as multipart, because the multipart-related classes that existed up to v4 are no longer present. This means that, depending on how you look at it, this issue may be out of scope.
I assumed that the maintainer had his or her own intentions on how to handle this area, so I contacted the maintainer about the matter, and it was fixed.
The HTTP Response issue is the same kind of issue as the Spring RFD (Reflected File Download) issue that we found after reporting the Firefox issue (which inspired this research).
Reflected File Download is an attack vector that was first presented at Black Hat Europe 2014. https://www.blackhat.com/docs/eu-14/materials/eu-14-Hafif-Reflected-File-Download-A-New-Web-Attack-Vector.pdf
In a nutshell (this may not be exactly accurate), RFD is an attack in which values entered by the attacker are downloaded as a file on the victim's terminal. It is especially a problem when the attacker also has control over the file extension, etc. of the file.
For example, suppose the following (good site) URL Path is accessed
/file_download?filename=abc.txt&contents=hello
In this case, the attacker would create a URL that uses filename=malicious.sh and contents=#! /bin/bash.........
The victim downloads the malicious file even though he/she has accessed the (unproblematic) official site.
To give a rough explanation, as the name RFD suggests, this attack "Reflects" the entered parameters into a file that is then downloaded.
In this article, cases where the file extension can be changed when uploaded content is downloaded, and cases where Content Injection is possible via CRLF Injection (starting from filename), are both treated as RFD issues.
Django is Python's and Sinatra is Ruby's Web Framework.
They forgot to escape the " character in the HTTP Response Content-Disposition > filename, resulting in an RFD.
https://security.snyk.io/vuln/SNYK-PYTHON-DJANGO-2968205 https://github.com/advisories/GHSA-2x8x-jmrp-phxw
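A minimal sketch of how an unescaped `"` leads to an RFD. Neither function below is Django's or Sinatra's actual code: `content_disposition_naive` is a hypothetical vulnerable helper, `first_quoted_filename` is a simplified stand-in for a lenient client (real browsers parse the header differently), and the payload is an assumed example.

```python
# Hypothetical vulnerable helper: wraps the filename in quotes without escaping ".
def content_disposition_naive(filename: str) -> str:
    return f'attachment; filename="{filename}"'

# Simplified stand-in for a lenient client: it takes the text between
# filename=" and the next double quote as the file name.
def first_quoted_filename(header: str) -> str:
    start = header.index('filename="') + len('filename="')
    end = header.index('"', start)
    return header[start:end]

# Assumed attacker input. It ends in .txt, so a naive server-side
# extension check on the raw value would accept it.
user_supplied = 'malicious.sh";dummy=.txt'
header = content_disposition_naive(user_supplied)

print(header)                         # attachment; filename="malicious.sh";dummy=.txt"
print(first_quoted_filename(header))  # malicious.sh
```

The injected `"` closes the quoted value early, so the extension the application validated (`.txt`) is not the extension the client ends up seeing (`.sh`).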
Iris is Golang's Web Framework.
https://github.com/kataras/iris
In this case, the value was not emitted in the quoted filename="..." format but in the unquoted filename=... format (an RFC violation).
Therefore, it was possible to insert another field simply by inserting the ; character.
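The unquoted case can be sketched like this. It is an illustration only, not Iris code: `content_disposition_unquoted` is a hypothetical helper, `split_params` is a deliberately simple parameter splitter, and the payload is an assumed example.

```python
# Hypothetical helper that emits the unquoted form: filename=...
def content_disposition_unquoted(filename: str) -> str:
    return f"attachment; filename={filename}"

# Deliberately simple parser: parameters are separated by ";".
def split_params(header: str) -> dict:
    params = {}
    for piece in header.split(";")[1:]:  # skip the disposition type
        key, _, value = piece.strip().partition("=")
        params[key] = value
    return params

# Assumed attacker input: a single ";" is enough to start a new parameter.
header = content_disposition_unquoted("report.txt; filename*=UTF-8''malicious.sh")
params = split_params(header)
print(params)
```

Without surrounding quotes, the value ends at the first `;`, so the attacker's text is read as a second, independent parameter (here a `filename*` field).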
Ktor is Kotlin's Web Framework.
Unlike other RFDs, the problem with Ktor occurred with filename*, not filename.
https://security.snyk.io/vuln/SNYK-JAVA-IOKTOR-2980134
filename* is an alternative to the filename field available in Content-Disposition, formatted as follows to support non-ASCII filenames when downloading files.
Content-Disposition: attachment; filename*=utf-8''{PARAMETER}
(If filename / filename* are mixed, filename* takes precedence)
In the case of Ktor, the URL Encoding that filename* requires was not performed when a file name like the following was used. This is why RFD was possible in some browsers (at least Firefox).
file name:
''malicious.sh%00'normal.txt
Generated Content-Disposition:
Content-Disposition: attachment; filename*=utf-8''malicious.sh%00'normal.txt
Since the above Content-Disposition is not in the normal format (the value should have been URL Encoded), some browsers judged it to be an invalid Content-Disposition and did not use it (ignoring the file name).
Incidentally, this PoC file was created by the following process.
''malicious.sh%00'normal.txt
- Insert '' at the beginning so that the value imitates the filename*='' format (to look RFC compliant... although, now that I write this, it is not actually compliant, because no charset such as UTF-8 is specified).
- Put ' in the middle. In Firefox, for some reason, this made it split the filename in a strange way (on its own it could not replace the extension properly; perhaps some char index was off).
- Insert %00 (NULL byte character) to pin down the misaligned parse position from the second step (i.e., to tell the parser "This is the end!").
As a result, it spits out a broken Content-Disposition, which the browser somehow tries to interpret, resulting in an RFD.
I tested for about 30 minutes to see whether Chrome or Safari had the same problem, but I could not reproduce it there.
Currently, when the file mentioned earlier is used as Input, the following Content-Disposition is generated.
Content-Disposition: attachment; filename*=utf-8''%27%27malicious.sh%2500%27normal.txt
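The encoded header above can be reproduced with a short sketch. This is not Ktor's implementation; `encode_filename_star` is a hypothetical helper that percent-encodes the UTF-8 bytes of the name using Python's standard `urllib.parse.quote`.

```python
from urllib.parse import quote

# Hypothetical helper: build the value of filename* as charset''pct-encoded-name.
def encode_filename_star(filename: str) -> str:
    # safe="" makes quote() percent-encode everything except letters,
    # digits and "_.-~", so both ' and % in the input are encoded.
    return "utf-8''" + quote(filename, safe="", encoding="utf-8")

poc_name = "''malicious.sh%00'normal.txt"
header = "Content-Disposition: attachment; filename*=" + encode_filename_star(poc_name)
print(header)
# Content-Disposition: attachment; filename*=utf-8''%27%27malicious.sh%2500%27normal.txt
```

Because `'` becomes `%27` and `%` becomes `%25`, the PoC name can no longer introduce its own `''` delimiter or a `%00` sequence into the header.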
Email attachments also use Content-Disposition, which is inserted in the multipart body.
There was a CRLF Injection problem in the Python Email module, starting from Content-Disposition > filename.
python/cpython#100612
In this problem, unlike the others, " was escaped, so simple field insertion seemed difficult.
However, since CRLF Injection was possible inside a multipart part (within what should be a single parameter), there still seemed to be a meaningful problem (possible content insertion).
However, when writing a file open process in Python, an Exception occurs when trying to load a file containing the \r\n characters, so we determined that the impact is low.
with open("abc\r\n.txt") as f:
    ...
Having found multiple problems in this way, we now have some idea of the pattern of impact.
Content-Disposition: attachment; filename={PARAMETER}; # Probably RFC violation
Content-Disposition: attachment; filename="{PARAMETER}";
Content-Disposition: attachment; filename*=utf-8''{PARAMETER}
(utf-8 may be replaced by other Encode formats, etc.)
Most are implemented in these patterns.
(In some cases, filename and filename* are written mixed together, but this is not a problem.)
However, all of the problems were caused by the following missing escapes with respect to the filename problem.
Case filename:
- "
- \r
- \n
Case filename*:
- URL Encode with proper formatting
Just to elaborate on the incorrect escaping of filename*: I have looked at about 50-80 services and it only happened in one Web Framework (Ktor). So I think this is a very rare problem.
It should be encoded according to either RFC or WHATWG's HTML Spec(multipart/form-data).
In the case of RFCs, escape with \ (I'm not sure about this, because I can't find the latest version of the RFC that mentions it).
Golang's multipart module is of this form.
https://github.com/golang/go/blob/1e7e160d070443147ee38d4de530ce904637a4f3/src/mime/multipart/writer.go#L132-L136
The WHATWG HTML Spec, on the other hand, uses URL Encoding (percent-encoding).
- " --> %22
- \r --> %0D
- \n --> %0A
https://html.spec.whatwg.org/#multipart-form-data
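The two escaping styles can be sketched as follows. Both helpers are hypothetical and only implement the character mappings listed in this article (backslash style: `"` to `\"`, `\r` to `\\r`, `\n` to `\\n`, plus escaping the backslash itself, which is an added assumption; WHATWG style: `%22`, `%0D`, `%0A`). They are not copies of the Go or browser implementations.

```python
# Hypothetical helper: backslash-style escaping for a quoted filename value.
def escape_backslash_style(filename: str) -> str:
    return (
        filename.replace("\\", "\\\\")  # escape the escape character first
        .replace('"', '\\"')
        .replace("\r", "\\r")
        .replace("\n", "\\n")
    )

# Hypothetical helper: WHATWG-style percent-encoding of ", CR and LF.
def escape_whatwg_style(filename: str) -> str:
    return (
        filename.replace('"', "%22")
        .replace("\r", "%0D")
        .replace("\n", "%0A")
    )

name = 'a"b\r\n.txt'
print(escape_backslash_style(name))  # a\"b\r\n.txt (the \r and \n are now literal text)
print(escape_whatwg_style(name))     # a%22b%0D%0A.txt

header = f'Content-Disposition: form-data; name="file"; filename="{escape_whatwg_style(name)}"'
print(header)
```

Either way, the escaped value no longer contains a raw `"`, CR or LF, so it cannot close the quoted string or start a new header line.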
Also, there is currently an Issue on whether to add this issue to OWASP ASVS v5, so if you are in the know, please comment.
This time, we surveyed about 50-80 frameworks, libraries, etc., and reported everything we found, so they are generally problem-free, but this does not mean that "using a framework is safe".
This is because some languages and frameworks do not perform automatic escaping. For example, some Web frameworks provide methods for adding raw HTTP Response Header. If a file download function is implemented using such a method, the file name escaping must be implemented by the developer.
And, sadly, as of 2023, there are probably no (Japanese) articles mentioning filename escaping at all. You can look up "Web Framework name + file download" or something like that, but you won't really find any reference to escaping. So, if the Web Framework or other components have not implicitly fixed it, your implementation of Content-Disposition is probably vulnerable.
Also, it is not absolutely certain that you will find it even if you are doing a Web security audit, etc.
I am (or used to be) a security assessor myself, and I thought I had studied a little, but I had not even recognized this problem until I did this investigation.
Because of the above background, I believe that this problem will continue to occur like whack-a-mole problems like XSS, SQLI, etc. Therefore, I understand that there are still some products that have not been fixed yet, but I have published this article for the time being.
I will add those that have been fixed to my github repository for reporting.
https://github.com/motoyasu-saburi/reported_vulnerability
It was a long research project, but it is finally finished. It was hard work. My battle is not over yet. There are still a few products left to fix, but this is the end of it.
As written in the TLDR section, the summary is as follows.
- I found 1 browser, 1 language, and 15 vulnerabilities in { Web Framework, HTTP Client library, Email library / Web Service, etc }
- All the vulnerabilities I found were found from a single perspective (I investigated maybe 50-80 products).
- The RFC description of the problem (rather confusingly) describes the requirements for this problem, while the WHATWG > HTML Spec is well documented.
- The problem is clearly targeted at the Content-Disposition fields filename and filename*.
- This problem affects HTTP Request/Response/Email in different ways.
  - HTTP Request: request tampering (especially with file contents, tainting of other fields, etc.)
  - HTTP Response: Reflect File Download vulnerability
  - Email: Attachment tampering (e.g., extension and filename tampering and potential file content tampering)
- Not many people currently see Content-Disposition (filename, filename*) as an obvious attack vector for these attack vectors.
- I haven't seen a single OWASP publication that summarizes this area properly. ASVS has an Issue on this.
- Make sure to escape filename and filename* in Content-Disposition.
  - filename:
    - " --> \" or %22
    - \r --> \\r or %0D
    - \n --> \\n or %0A
  - filename*:
    - URL Encode with proper formatting
Incidentally, in the process of writing this article, I came across someone (a GitHub employee) who has looked at the issue from a similar perspective.
https://securitylab.github.com/research/rfd-spring-mvc-CVE-2020-5398/
WHATWG HTML Spec - multipart/form-data
https://html.spec.whatwg.org/#multipart-form-data
RFC 6266 (Use of the Content-Disposition Header Field in the Hypertext Transfer Protocol (HTTP)):
https://tools.ietf.org/html/rfc6266#section-5
RFC 2183 (Communicating Presentation Information in Internet Messages: The Content-Disposition Header Field)
https://datatracker.ietf.org/doc/html/rfc2183
Escape Implementation in Golang: https://github.com/golang/go/blob/1e7e160d070443147ee38d4de530ce904637a4f3/src/mime/multipart/writer.go#L132-L136
Escape Implementation in Symfony: https://github.com/symfony/symfony/blob/123b1651c4a7e219ba59074441badfac65525efe/src/Symfony/Component/HttpFoundation/HeaderUtils.php#L187-L189
Escape Implementation in Spring: https://github.com/spring-projects/spring-framework/blob/4cc91e46b210b4e4e7ed182f93994511391b54ed/spring-web/src/main/java/org/springframework/http/ContentDisposition.java#L259-L267