Creating a lambda layer package for WeasyPrint with Python3.8

Dockerfile

FROM lambci/lambda:build-python3.7 AS py37
FROM lambci/lambda:build-python3.8

# download libraries
RUN yum install -y yum-utils rpmdevtools
WORKDIR /tmp
RUN yumdownloader --resolve \
    expat \
    glib2 \
    libffi \
    libffi-devel \
    cairo \
    pango && \
    rpmdev-extract *rpm

# install libraries and set links
RUN mkdir /opt/lib
WORKDIR /opt/lib
RUN cp -P -R /tmp/*/usr/lib64/* /opt/lib
RUN ln libgobject-2.0.so.0 libgobject-2.0.so && \
    ln libcairo.so.2 libcairo.so && \
    ln libpango-1.0.so.0 pango-1.0 && \
    ln libpangoft2-1.0.so.0 pangoft2-1.0 && \
    ln libpangocairo-1.0.so.0 pangocairo-1.0

# copy fonts and set environment variable
COPY --from=py37 /usr/share/fonts/default /opt/fonts/default
COPY --from=py37 /etc/fonts/fonts.conf /opt/fonts/fonts.conf
RUN sed -i s:/usr/share/fonts:/opt/fonts: /opt/fonts/fonts.conf
ENV FONTCONFIG_PATH="/opt/fonts"

# install weasyprint and dependencies
WORKDIR /opt
RUN pipenv install weasyprint
RUN mkdir -p python/lib/python3.8/site-packages
RUN pipenv lock -r > requirements.txt
RUN pip install -r requirements.txt --no-deps -t python/lib/python3.8/site-packages

# remove warning about cairo < 1.15.4
WORKDIR /opt/python/lib/python3.8/site-packages/weasyprint
RUN sed -i.bak '34,40d' document.py

# run test
WORKDIR /opt
ADD test.py .
RUN pipenv run python test.py

# package lambda layer
WORKDIR /opt
RUN zip -r weasyprint-py38x.zip fonts lib python
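
Lambda extracts a layer archive under /opt, so the three top-level directories zipped above end up as /opt/fonts, /opt/lib and /opt/python. A small check like the following could confirm that layout before publishing. It is only a sketch: the helper name `missing_layer_dirs` is hypothetical and not part of the original gist.

```python
import zipfile

# Top-level directories the Dockerfile above puts into the layer archive.
REQUIRED_TOP_LEVEL = {"fonts", "lib", "python"}


def missing_layer_dirs(zip_path):
    """Return the required top-level directories absent from the archive.

    `zip_path` may be a filename or a binary file-like object.
    """
    with zipfile.ZipFile(zip_path) as archive:
        top_level = {name.split("/", 1)[0] for name in archive.namelist()}
    return sorted(REQUIRED_TOP_LEVEL - top_level)
```

Calling `missing_layer_dirs("weasyprint-py38x.zip")` should return an empty list for an archive built by the Dockerfile above.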

test.py

from weasyprint import HTML

data = """
<!DOCTYPE html>
<html>
<body>
<h1>Hello World</h1>
<p>Just Testing.</p>
</body>
</html>
"""

HTML(string=data).write_pdf("output.pdf")

Bash script to generate lambda-layer

#!/bin/bash

set -e

PYVER=py38x

# note: this expects the Dockerfile above to be saved as Dockerfile-$PYVER
docker image build -f Dockerfile-$PYVER -t weasyprint-$PYVER .
docker create -ti --name dummy-$PYVER weasyprint-$PYVER bash
docker cp dummy-$PYVER:/opt/weasyprint-$PYVER.zip .
docker cp dummy-$PYVER:/opt/output.pdf ./output-$PYVER.pdf
docker rm dummy-$PYVER

aws lambda publish-layer-version --layer-name weasyprint-$PYVER --zip-file fileb://weasyprint-$PYVER.zip

Lambda function

Remember to set FONTCONFIG_PATH to /opt/fonts under Environment variables in the Lambda function console.

Also, set the memory to 256 MB and the timeout to 1 min 0 sec.
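
The same settings could also be applied from a script instead of the console. The sketch below builds the arguments for boto3's `update_function_configuration` call. The helper names, the function name you pass in, and the optional layer ARN are placeholders, not part of the original setup.

```python
def build_function_config(function_name, layer_arn=None):
    """Return keyword arguments for Lambda's update_function_configuration."""
    config = {
        "FunctionName": function_name,
        "MemorySize": 256,  # MB
        "Timeout": 60,      # seconds (1 min 0 sec)
        # Caution: this replaces the function's whole set of environment variables.
        "Environment": {"Variables": {"FONTCONFIG_PATH": "/opt/fonts"}},
    }
    if layer_arn is not None:
        config["Layers"] = [layer_arn]
    return config


def apply_function_config(function_name, layer_arn=None):
    """Apply the settings; requires AWS credentials and the boto3 package."""
    import boto3  # imported lazily so the builder works without boto3 installed

    client = boto3.client("lambda")
    return client.update_function_configuration(
        **build_function_config(function_name, layer_arn)
    )
```

The layer ARN, if used, would be the `LayerVersionArn` returned by the `publish-layer-version` command in the Bash script.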

import json
from weasyprint import HTML

data = """
<!DOCTYPE html>
<html>
<body>
<h1>Hello World</h1>
<p>Just Testing.</p>
</body>
</html>
"""

def lambda_handler(event, context):
    HTML(string=data).write_pdf("/tmp/output.pdf")
    return {
        'statusCode': 200,
        'body': json.dumps('Success!')
    }

Output of my Lambda function

{
  "statusCode": 200,
  "body": "\"Success!\""
}

Now Using S3

policy-s3-testing-weasyprint-output

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::testing-weasyprint-output",
                "arn:aws:s3:::testing-weasyprint-output/*"
            ]
        }
    ]
}

role-lambda-testing-weasyprint-output

Create a new role, attach the AWSLambdaBasicExecutionRole and policy-s3-testing-weasyprint-output policies to it, and assign the role to the Lambda function.

Create the private bucket named testing-weasyprint-output on S3.
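
For anyone scripting the role instead of using the console, a rough sketch with boto3 might look like this. It assumes the S3 policy above already exists and that you know its ARN. The helper names are hypothetical. The trust policy is the standard one that lets the Lambda service assume a role.

```python
import json

# AWS-managed policy that grants permission to write CloudWatch logs.
BASIC_EXECUTION_ARN = (
    "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"
)


def lambda_trust_policy():
    """Trust policy allowing the Lambda service to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "lambda.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def create_role(role_name, s3_policy_arn):
    """Create the role and attach both policies; needs AWS credentials."""
    import boto3  # imported lazily so the policy builder works without boto3

    iam = boto3.client("iam")
    iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(lambda_trust_policy()),
    )
    for arn in (BASIC_EXECUTION_ARN, s3_policy_arn):
        iam.attach_role_policy(RoleName=role_name, PolicyArn=arn)
```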

Lambda function handler

import json
import boto3
from weasyprint import HTML

s3 = boto3.client('s3')
bucket = "testing-weasyprint-output"

def lambda_handler(event, context):
    # generate PDF
    filekey = event['filekey']
    HTML(string=event['data']).write_pdf(f'/tmp/{filekey}')
    
    # upload output to S3
    with open(f'/tmp/{filekey}', 'rb') as f:
        s3.upload_fileobj(f, bucket, filekey, ExtraArgs={'ContentType': 'application/pdf'})
        
    # get download URL
    Params = {'Bucket': bucket, 'Key': filekey}
    url = s3.generate_presigned_url(ClientMethod='get_object', Params=Params, ExpiresIn=60)
    
    # results
    return {
        'statusCode': 200,
        'body': json.dumps({'url': url})
    }

The input to this function is something like this:

{
  "filekey": "test.pdf",
  "data": "<!DOCTYPE html><html><body><h1>Hello WeasyPrint!</h1><p>Just testing this nice tool.</p></body></html>"
}
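
A payload of this shape could be built and sent from Python roughly as follows. This is a sketch only: `build_event` and `invoke_pdf_function` are hypothetical helpers, and the check on `filekey` is an added assumption (the handler writes to /tmp/<filekey>, so keys with path separators are rejected here).

```python
import json


def build_event(filekey, html):
    """Return the JSON payload expected by the handler above."""
    # Assumption: only flat keys are allowed, since the handler
    # writes the PDF to /tmp/<filekey> before uploading it.
    if not filekey or "/" in filekey or "\\" in filekey:
        raise ValueError("filekey must be a plain file name")
    return json.dumps({"filekey": filekey, "data": html})


def invoke_pdf_function(function_name, filekey, html):
    """Invoke the function and return its decoded result; needs AWS credentials."""
    import boto3  # imported lazily so build_event works without boto3

    client = boto3.client("lambda")
    response = client.invoke(
        FunctionName=function_name,
        Payload=build_event(filekey, html).encode("utf-8"),
    )
    return json.loads(response["Payload"].read())
```

The presigned URL in the result expires after 60 seconds (`ExpiresIn=60` in the handler), so it needs to be opened promptly.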

The only minor annoyance is the following set of warnings from GLib and Pango, which do not stop the PDF from being generated:

Log output (no dramas though)

START RequestId: 28f8d633-2647-45fe-a0ce-776effb4101e Version: $LATEST

(process:8): GLib-GObject-WARNING **: 02:30:18.438: cannot register existing type 'PangoFcFontMap'

(process:8): GLib-CRITICAL **: 02:30:18.438: g_once_init_leave: assertion 'result != 0' failed

(process:8): Pango-CRITICAL **: 02:30:18.438: pango_fc_font_map_set_config: assertion 'PANGO_IS_FC_FONT_MAP (fcfontmap)' failed
END RequestId: 28f8d633-2647-45fe-a0ce-776effb4101e
REPORT RequestId: 28f8d633-2647-45fe-a0ce-776effb4101e	Duration: 2258.65 ms	Billed Duration: 2300 ms	Memory Size: 256 MB	Max Memory Used: 78 MB	
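
The REPORT line is handy for checking whether the 256 MB setting leaves enough headroom. A minimal sketch for pulling the numbers out of such a line is shown below. The regular expression is written against the line above and may need adjusting for other log formats.

```python
import re

# Matches the metrics portion of a Lambda REPORT log line.
REPORT_PATTERN = re.compile(
    r"Duration: (?P<duration_ms>[\d.]+) ms\s+"
    r"Billed Duration: (?P<billed_ms>\d+) ms\s+"
    r"Memory Size: (?P<memory_mb>\d+) MB\s+"
    r"Max Memory Used: (?P<max_used_mb>\d+) MB"
)


def parse_report(line):
    """Return the metrics from a REPORT line as a dict, or None if no match."""
    match = REPORT_PATTERN.search(line)
    if match is None:
        return None
    return {
        "duration_ms": float(match["duration_ms"]),
        "billed_ms": int(match["billed_ms"]),
        "memory_mb": int(match["memory_mb"]),
        "max_used_mb": int(match["max_used_mb"]),
    }
```

For the run above, this gives 78 MB used out of 256 MB configured.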

Opening the returned link gives the PDF shown in the figure below (very nice!)

[Image: w1]
