@romain-grecourt
Last active October 15, 2020 19:35
Helidon Multipart Support

Helidon Multipart Support Proposal

Revision  Helidon version  Date        Comment
1         0.x              02/15/2019  Initial proposal
2         1.1.1            06/07/2019  Revised proposal

Proposal

Provide an API for the reactive webserver that enables processing multipart requests and returning multipart responses.

Two models are described:

  • A high level model that is driven by an entity representing all buffered parts
  • A low level model that is fully reactive

The high level model is built on the low level model.

Examples

Let's illustrate with code examples first.

Enable Multipart Support

The readers and writers for the multipart entity types will be registered by a service, similar to how other media types are supported in the webserver. All the examples below assume that MultiPartSupport is registered as follows:

Routing.builder()
    .register(MultiPartSupport.create())
    //...
    .build();

High Level Model

Consumes each body part as a String:

request.content().as(MultiPart.class).thenAccept(multiPart -> {
    for(BodyPart bodyPart : multiPart.bodyParts()){
        String content = bodyPart.as(String.class);
        System.out.println("Part:\n" + content);
    }
    //...
});

Get a body part named "myfile" as JsonObject:

request.content().as(MultiPart.class).thenAccept(multiPart -> {
    JsonObject json = multiPart.field("myfile")
                        .map(part -> part.as(JsonObject.class))
                        .orElse(null);
    //...
});

Get the body parts of a multi file upload as JsonObject:

request.content().as(MultiPart.class).thenAccept(multiPart -> {
    for(BodyPart part : multiPart.fields("myfile[]")) {
        JsonObject json = part.as(JsonObject.class);
        //...
    }
    //...
});

Return a multipart response

The following example returns a multipart response with 2 parts:

response.send(MultiPart.builder()
    .bodyPart(BodyPart.create("part1 body content"))
    .bodyPart(BodyPart.create("part2 body content"))
    .build());

Low Level Model

The following example consumes each part content as a JsonObject:

request.content().asPublisherOf(BodyPart.class).subscribe(new Subscriber<>(){

    @Override
    public void onSubscribe(Subscription subscription) {
        subscription.request(Long.MAX_VALUE);
    }

    @Override
    public void onNext(BodyPart part) {
        part.content().as(JsonObject.class).thenAccept(json -> {
            // ...
        });
    }

    @Override
    public void onError(Throwable error) {
    }

    @Override
    public void onComplete() {
    }
});

The following example consumes the content of each part using reactive stream subscribers. ServerFileWriter is inspired by the streaming example at webserver/examples/streaming:

request.content().asPublisherOf(BodyPart.class).subscribe(new Subscriber<>(){
    @Override
    public void onSubscribe(Subscription subscription) {
        subscription.request(Long.MAX_VALUE);
    }

    @Override
    public void onNext(BodyPart bodyPart) {
        bodyPart.content().subscribe(new ServerFileWriter());
    }

    @Override
    public void onError(Throwable error) {
        response.status(500);
        response.send();
    }

    @Override
    public void onComplete() {
        response.send("Files uploaded successfully");
    }
});

Return a multipart response

The following example returns a multipart response using publishers of DataChunk. ServerFileReader is inspired by the streaming example at webserver/examples/streaming:

response.send(MultiPart.builder()
    .bodyPart(BodyPart.create(
        BodyPartHeaders.builder()
            .contentType(MediaType.APPLICATION_JSON)
            .contentDisposition(ContentDisposition.builder()
                .filename("file1")
                .build())
            .build(),
        new ServerFileReader(filePath1)))
    .bodyPart(BodyPart.create(
        BodyPartHeaders.builder()
            .contentType(MediaType.APPLICATION_JSON)
            .contentDisposition(ContentDisposition.builder()
                .filename("file2")
                .build())
            .build(),
        new ServerFileReader(filePath2)))
    .build());

Buffering

A BodyPart instance may or may not be buffered depending on how it is created.

When creating a BodyPart instance using a Publisher<DataChunk> backed by buffered content, the buffered flag is set.

BodyPart has a method public <T> T as(Class<T> type) that can be used to consume the content of a part synchronously, for parts known to be buffered.

JsonObject json = bodyPart.as(JsonObject.class);

This is equivalent to:

try {
    return bodyPart.content().as(JsonObject.class).toCompletableFuture().get();
} catch (InterruptedException | ExecutionException ex) {
    throw new IllegalStateException(ex.getMessage(), ex);
}
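As a minimal sketch of how a buffered flag could guard the synchronous accessor, consider the stand-in class below. SketchPart and asString are hypothetical names used only for illustration, and rejecting non-buffered parts with an IllegalStateException is an assumption, not something this proposal specifies.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.ExecutionException;

// Hypothetical stand-in for BodyPart; all names are illustrative only.
final class SketchPart {
    private final CompletionStage<String> content;
    private final boolean buffered;

    SketchPart(CompletionStage<String> content, boolean buffered) {
        this.content = content;
        this.buffered = buffered;
    }

    // Asynchronous access is always allowed.
    CompletionStage<String> content() {
        return content;
    }

    // Synchronous access; assumed to be rejected when the part is not buffered,
    // since blocking on content that has not arrived yet could stall the caller.
    String asString() {
        if (!buffered) {
            throw new IllegalStateException("part is not buffered");
        }
        try {
            return content.toCompletableFuture().get();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IllegalStateException(ex.getMessage(), ex);
        } catch (ExecutionException ex) {
            throw new IllegalStateException(ex.getMessage(), ex);
        }
    }
}
```

With a buffered part the future is already complete, so the call to get() returns immediately.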

Synchronous ease

Consider the counter-example below, assuming the parts are not buffered:

request.content().as(MultiPart.class).thenAccept(multiPart -> {
    for(BodyPart bodyPart : multiPart.bodyParts()) {
        bodyPart.content().as(JsonObject.class).thenAccept(json -> {
            //...
        });
    }
    // if the parts are not buffered
    // the response will be sent before the parts are consumed
    response.send();
});

If we were to support non-buffered parts with the MultiPart entity, then there would need to be a hook to send the response only when all parts are processed. Instead of complicating the API with such a hook, the MultiPart entity aims at providing a simple model for the main usages. Most usages of multipart are likely to be multipart/form-data with small file uploads.

By buffering the BodyPart instances returned by MultiPart, we eliminate this problem and enable a plain synchronous model:

request.content().as(MultiPart.class).thenAccept(multiPart -> {
    for(BodyPart bodyPart : multiPart.bodyParts()) {
        JsonObject json = bodyPart.as(JsonObject.class);
    }
    // the response is sent after the parts are consumed
    response.send();
});
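To illustrate what supporting non-buffered parts would demand, here is a rough sketch of such a "send when all parts are processed" hook, built on plain CompletableFuture. AllPartsHook and whenAllConsumed are hypothetical names, and the two futures merely simulate part content that arrives later. Nothing here is part of the proposed API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical helper; not part of the proposed API.
final class AllPartsHook {

    // Completes once every per-part processing future has completed.
    static CompletableFuture<Void> whenAllConsumed(List<CompletableFuture<?>> processed) {
        return CompletableFuture.allOf(processed.toArray(new CompletableFuture<?>[0]));
    }

    // Simulates two non-buffered parts whose content arrives later.
    static List<String> demo() {
        List<String> events = new ArrayList<>();
        CompletableFuture<String> part1 = new CompletableFuture<>();
        CompletableFuture<String> part2 = new CompletableFuture<>();

        // Keep the futures returned by thenAccept: they complete after the
        // consuming callback has run, not merely when the content is available.
        CompletableFuture<Void> done1 = part1.thenAccept(s -> events.add("consumed " + s));
        CompletableFuture<Void> done2 = part2.thenAccept(s -> events.add("consumed " + s));

        // The "response" is sent only once both parts have been consumed.
        whenAllConsumed(Arrays.asList(done1, done2))
                .thenRun(() -> events.add("response sent"));

        part1.complete("part1");
        part2.complete("part2");
        return events;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Every handler would have to collect and combine these futures by hand, which is the bookkeeping the buffered model avoids.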

Form Data

RFC 7578 (multipart/form-data) mandates that the Content-Disposition header contain a name parameter. The filename parameter is also commonly provided.

We could specialize for this use case, which is the most common use of multipart (form-based file upload). However, this would complicate the API by duplicating the entity classes (MultiPart, BodyPart, headers and content disposition classes).

This proposal does not specialize the form-data use case and instead makes the trade-off of providing generic multipart support. This means that the required Content-Disposition name parameter wouldn't be strictly enforced.

BodyPart has the following shorthand methods to access the name and filename parameters:

  • public String name()
  • public String filename()

Both would return null if not present.

MultiPart has shorthand methods to filter the parts based on the name parameters:

  • public Optional<BodyPart> field(String name)
  • public List<BodyPart> fields(String name)
  • public Map<String, List<BodyPart>> fields()

In the case where the parts do not have a control name, these methods would return an empty optional, an empty list and an empty map, respectively.
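The filtering behaviour described above could look roughly like the sketch below. NamedParts and its nested Part class are hypothetical stand-ins for MultiPart and BodyPart. Grouping in order of first appearance is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a buffered multipart entity; names are illustrative.
final class NamedParts {

    // Minimal part: a control name (may be null) and its text content.
    static final class Part {
        final String name;
        final String content;

        Part(String name, String content) {
            this.name = name;
            this.content = content;
        }
    }

    private final Map<String, List<Part>> byName = new LinkedHashMap<>();

    NamedParts(List<Part> parts) {
        for (Part part : parts) {
            if (part.name != null) { // parts without a control name are not indexed
                byName.computeIfAbsent(part.name, k -> new ArrayList<>()).add(part);
            }
        }
    }

    // First part with the given name, if any.
    Optional<Part> field(String name) {
        return fields(name).stream().findFirst();
    }

    // All parts with the given name, or an empty list.
    List<Part> fields(String name) {
        return byName.getOrDefault(name, Collections.emptyList());
    }

    // All named parts, grouped by name in order of first appearance.
    Map<String, List<Part>> fields() {
        return Collections.unmodifiableMap(byName);
    }
}
```

Parts without a name are simply left out of the index, so all three lookups come back empty when no part carries a control name.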

Part Headers

The most commonly used headers for body parts are Content-Type and Content-Disposition. Both are optional, and other arbitrary headers are allowed.

Similar to RequestHeaders, we'd have a class BodyPartHeaders designed to solve the common scenarios. On top of the methods inherited from the Headers interface, it provides two methods:

  • public MediaType contentType()
  • public ContentDisposition contentDisposition()

Neither method returns an Optional, because that would complicate API usage without providing much value. Optionals are better suited for holding single values, not intermediate objects.

Note that the following specifications describe default values for the Content-Type header:

When the Content-Disposition is not present, contentDisposition() returns an instance of ContentDisposition representing the empty value.

Content-Disposition header

This class models the Content-Disposition parameters described in RFC2183. It is basically a type and a set of parameters:

  • name
  • filename
  • creation-date
  • modification-date
  • read-date
  • size

This class has the following methods:

  • public String type()
  • public Optional<String> name()
  • public Optional<String> filename()
  • public Optional<ZonedDateTime> modificationDate()
  • public Optional<ZonedDateTime> readDate()
  • public OptionalLong size()
  • public Map<String, String> parameters()

An instance of ContentDisposition representing an empty value would return an empty string for type(), empty optionals and an empty map.
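As a rough sketch of the "type plus parameters" shape, a simplified parser might look like this. ContentDispositionSketch is a hypothetical name. The parsing is deliberately naive and makes no attempt to handle escaped quotes, semicolons inside quoted values, or extended parameters such as filename*.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical, simplified model of a Content-Disposition header value.
final class ContentDispositionSketch {
    private final String type;
    private final Map<String, String> parameters;

    private ContentDispositionSketch(String type, Map<String, String> parameters) {
        this.type = type;
        this.parameters = Collections.unmodifiableMap(parameters);
    }

    static ContentDispositionSketch parse(String header) {
        Map<String, String> params = new LinkedHashMap<>();
        if (header == null || header.trim().isEmpty()) {
            // The "empty value" instance: empty type, no parameters.
            return new ContentDispositionSketch("", params);
        }
        String[] tokens = header.split(";");
        for (int i = 1; i < tokens.length; i++) {
            int eq = tokens[i].indexOf('=');
            if (eq < 0) {
                continue; // skip malformed parameters
            }
            // Parameter names are matched case-insensitively.
            String key = tokens[i].substring(0, eq).trim().toLowerCase(Locale.ROOT);
            String value = tokens[i].substring(eq + 1).trim();
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1); // strip surrounding quotes
            }
            params.put(key, value);
        }
        String type = tokens.length > 0 ? tokens[0].trim().toLowerCase(Locale.ROOT) : "";
        return new ContentDispositionSketch(type, params);
    }

    String type() { return type; }
    Optional<String> name() { return Optional.ofNullable(parameters.get("name")); }
    Optional<String> filename() { return Optional.ofNullable(parameters.get("filename")); }
    Map<String, String> parameters() { return parameters; }
}
```

For example, parsing `form-data; name="myfile"; filename="a.json"` gives the type form-data with name and filename parameters, while a null or blank header gives the empty instance.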

Buffer until threshold

Since MultiPart is buffered, it is not suited for "big" files. Applications that need to support "big" files would have to use the low-level reactive model. We could also decide to implement a mechanism that writes to disk past a certain threshold, similar to mimepull, Netty's multipart support or Apache Commons FileUpload.
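A threshold mechanism of this kind could be sketched as an OutputStream that keeps content in memory until a limit is exceeded and then spills to a temporary file. ThresholdBuffer is a hypothetical name and the temp-file naming is arbitrary. This is an illustration of the idea, not how any of the libraries mentioned above implement it.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: buffers in memory up to a threshold, then spills to a temp file.
final class ThresholdBuffer extends OutputStream {
    private final int threshold;
    private ByteArrayOutputStream memory = new ByteArrayOutputStream();
    private OutputStream fileOut; // non-null once the threshold has been exceeded
    private Path file;            // null while the content is still in memory
    private long written;

    ThresholdBuffer(int threshold) {
        this.threshold = threshold;
    }

    @Override
    public void write(int b) throws IOException {
        write(new byte[] {(byte) b}, 0, 1);
    }

    @Override
    public void write(byte[] buf, int off, int len) throws IOException {
        if (fileOut == null && written + len > threshold) {
            // Switch to disk: move what was buffered so far into the file.
            file = Files.createTempFile("multipart-", ".part");
            fileOut = Files.newOutputStream(file);
            memory.writeTo(fileOut);
            memory = null;
        }
        if (fileOut != null) {
            fileOut.write(buf, off, len);
        } else {
            memory.write(buf, off, len);
        }
        written += len;
    }

    @Override
    public void close() throws IOException {
        if (fileOut != null) {
            fileOut.close();
        }
    }

    boolean inMemory() { return fileOut == null; }
    long size() { return written; }
    Path file() { return file; }
}
```

The caller is responsible for deleting the temporary file once the part has been consumed.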

Misc

We can also decide to expose some of the underlying reactive implementation. See the bulk of the POC code here.

MultiPartDecoder is the underlying Processor<DataChunk, BodyPart> that parses the request payload and publishes BodyPart.

MultiPartEncoder is the underlying Processor<BodyPart, DataChunk> that consumes BodyPart and generates the response payload.

BufferingBodyPartSubscriber is the underlying Subscriber<BodyPart> that buffers BodyPart.
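To give a feel for the buffering subscriber, here is a minimal generic sketch built on java.util.concurrent.Flow (Java 9+). BufferingSubscriberSketch is a hypothetical name. The actual BufferingBodyPartSubscriber in the POC may be structured differently and may use different reactive types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Flow;

// Hypothetical sketch: collects every published item and exposes them as a future.
final class BufferingSubscriberSketch<T> implements Flow.Subscriber<T> {
    private final List<T> items = new ArrayList<>();
    private final CompletableFuture<List<T>> future = new CompletableFuture<>();

    // Completes with all items once the publisher signals onComplete.
    CompletableFuture<List<T>> future() {
        return future;
    }

    @Override
    public void onSubscribe(Flow.Subscription subscription) {
        // Buffering everything, so request an unbounded number of items.
        subscription.request(Long.MAX_VALUE);
    }

    @Override
    public void onNext(T item) {
        items.add(item);
    }

    @Override
    public void onError(Throwable error) {
        future.completeExceptionally(error);
    }

    @Override
    public void onComplete() {
        future.complete(items);
    }
}
```

Requesting Long.MAX_VALUE disables backpressure, which is why an unbounded buffer of this kind is only suitable for small payloads.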

Open Issues

  • Should MultiPart implement a max size limit for buffering body part content (to protect against denial of service)?