by Dinis Cruz and ChatGPT Deep Research, 22-Feb-2025
Overview: This whitepaper describes an architecture for capturing Amazon CloudFront requests using AWS Lambda@Edge and forwarding the logs to OpenObserve for centralized analysis. We integrate CloudFront (as the content delivery network) with a logging Lambda@Edge function that sends request details to OpenObserve, an open-source observability platform. This approach provides near real-time visibility into CDN traffic without waiting for standard CloudFront access logs (which can have delays of up to 24 hours (amazon web services - How to capture lambda @edge requests to kinesis? - Stack Overflow)). OpenObserve’s analytics engine allows us to search and visualize these logs with custom dashboards and real-time alerts (Monitoring CloudFront Access Logs with Kinesis Streams & Amazon Data Firehose: A Step-by-Step Guide | Open Source Observability Platform for Logs, Metrics, Traces, and More – Your Ultimate Dashboard for Alerts and Insights), improving the observability of our web applications.
Benefits of Lambda@Edge Logging: Using Lambda@Edge to capture requests offers several benefits:
- Real-time Logging: Unlike CloudFront’s standard logs, which are delivered periodically, a Lambda@Edge can send log data immediately per request. This reduces the time to detect issues or analyze traffic patterns.
- Rich, Custom Data: The Lambda@Edge function runs on each viewer request, allowing you to capture custom metadata (headers, client IP, CloudFront distribution ID, etc.) beyond what CloudFront’s native logs provide. You can log exactly the information needed for debugging or analytics.
- Direct Integration with OpenObserve: The function can forward logs directly to OpenObserve’s ingestion API. OpenObserve supports an Elasticsearch-compatible JSON ingestion interface (Sending logs to OpenObserve using syslog-ng - Blog), making it easy to push logs and later query them. This eliminates the need for complex pipelines (like Kinesis and Firehose) just to get CloudFront logs into an analysis system (amazon web services - How to capture lambda @edge requests to kinesis? - Stack Overflow).
- Serverless and Managed: The entire solution leverages managed services. CloudFront handles global content delivery, Lambda@Edge scales automatically at edge locations, and OpenObserve (self-hosted or cloud) stores and indexes the logs. There are no servers to manage for the logging pipeline.
Key Components:
- Amazon CloudFront Distribution: Fronts the application and static content. We will configure multiple origins (an S3 bucket and a Lambda URL for our FastAPI backend) and behaviors (path-based routing) on the distribution.
- AWS Lambda (FastAPI Application): A FastAPI web application deployed to AWS Lambda (using a container image) in the eu-west-2 region. It serves dynamic content via a Lambda Function URL, which CloudFront uses as an origin.
- Amazon S3 (Static Content): An S3 bucket holding static files (if any) to be served via CloudFront for specific path patterns.
- AWS Lambda@Edge (Logging Function): A Lambda function deployed in us-east-1 (N. Virginia) that CloudFront triggers on each request. It extracts request details (CloudFront event data) and sends a log entry to OpenObserve.
- OpenObserve: The observability platform where logs are sent. OpenObserve stores the CloudFront request logs and provides a UI and API for querying, dashboards, and alerts. It exposes a logs ingestion API (e.g., an Elasticsearch-compatible endpoint) for receiving JSON log entries (Sending logs to OpenObserve using syslog-ng - Blog).
By combining these components, each user request to CloudFront triggers the Lambda@Edge logger, which immediately streams request metadata to OpenObserve. This results in an observable CDN – we can monitor user requests, performance, and errors globally in near real-time.
This section walks through a manual setup using the AWS Management Console. It covers creating the CloudFront distribution, deploying the FastAPI Lambda, adding the Lambda@Edge logger, and verifying that logs appear in OpenObserve.
- Create a CloudFront Distribution: In the AWS Console, navigate to CloudFront and create a new distribution. Use a Web distribution (for HTTP/S content).
- Configure Origins: Define two origins, one for static content and one for dynamic content:
  - Origin A – S3 Bucket: Point to your S3 bucket (for example, my-website-assets). This will serve requests for static files under a specific path (e.g., /public-data/*).
  - Origin B – Lambda Function URL: Use the URL of your FastAPI Lambda function (discussed in Step 2) as a custom origin. The Lambda Function URL is in the format https://<url-id>.lambda-url.<region>.on.aws (Using Amazon CloudFront with AWS Lambda as origin to accelerate your web applications | Networking & Content Delivery). This origin will handle all other routes (the application’s API endpoints).
- Set up Cache Behaviors: Create two cache behaviors with path pattern matching:
  - Behavior 1: Path pattern /public-data/*. Associate this with Origin A (S3). This catches requests for static assets. Since we want to always fetch the latest data and log every request, choose a Cache Policy that disables caching. You can use the managed policy “CachingDisabled”, which sets Minimum, Maximum, and Default TTL to 0 (effectively turning off caching) (Prevent Amazon CloudFront from caching certain files - AWS re:Post). Attach an Origin Request Policy only if the origin needs extra viewer values; for an S3 origin, avoid forwarding the viewer's Host header, because S3 uses that header to identify the bucket.
  - Behavior 2: Default path pattern (*). Associate this with Origin B (the FastAPI Lambda URL). This catches any request not under /public-data/*, essentially your API or dynamic routes. Use the Managed-CachingDisabled cache policy here as well so that CloudFront never caches dynamic responses. For the origin request policy, prefer the managed AllViewerExceptHostHeader policy, which forwards viewer headers, cookies, and query strings but not the Host header. A Lambda Function URL expects its own domain in the Host header, so the plain All Viewer policy, which forwards the viewer's Host, typically causes the origin to reject the request.
- Viewer Protocol Policy (HTTPS Redirect): For both behaviors, set the viewer protocol policy to “Redirect HTTP to HTTPS.” This ensures that if a user accesses the site over HTTP, CloudFront will automatically redirect them to the HTTPS URL. Enabling this keeps all traffic encrypted without needing manual redirects in your application.
- Finalize Distribution Settings: If you plan to use a custom domain name for your distribution (as we will in Step 2), configure the Alternate Domain Name (CNAME) in the distribution settings and attach an SSL certificate (from AWS Certificate Manager in us-east-1, covering your domain). For now, you can proceed with the default CloudFront domain and we will add the custom domain via Route 53 in a later step.
- Create the Distribution: Save and create the CloudFront distribution. Note the distribution’s domain name (e.g., d1234abcd.cloudfront.net) for use when configuring DNS and testing.
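To make the routing rules concrete, the following Python sketch approximates how the two behaviors described above divide traffic. It is only an illustration: CloudFront evaluates path patterns itself, and fnmatchcase is a stand-in whose handling of characters such as [ ] may differ from CloudFront's. The origin label OriginA_S3 is a hypothetical name chosen for this example.

```python
from fnmatch import fnmatchcase

# Ordered list of (path pattern, origin label); labels are illustrative only.
BEHAVIORS = [
    ("/public-data/*", "OriginA_S3"),
]
# The default (*) behavior applies only when no other pattern matches.
DEFAULT_ORIGIN = "OriginB_FastAPI"


def pick_origin(path: str) -> str:
    """Return the origin label that would serve `path` in this sketch."""
    for pattern, origin in BEHAVIORS:
        # Case-sensitive match; '*' matches any run of characters.
        if fnmatchcase(path, pattern):
            return origin
    return DEFAULT_ORIGIN


if __name__ == "__main__":
    for sample in ("/public-data/test.txt", "/api/hello", "/"):
        print(sample, "->", pick_origin(sample))
```

Because the default behavior is only a fallback, anything configured per behavior (cache policy, protocol policy, function associations) has to be set on each behavior separately.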
Next, deploy the FastAPI application as a Lambda function and make it accessible to CloudFront:
- Containerize the FastAPI App: Prepare a Docker image for your FastAPI application. Lambda does not route HTTP traffic to a listening port on its own, so the image needs an adapter: either add the AWS Lambda Web Adapter extension and run a web server (e.g., Uvicorn) on the port the adapter forwards to (8080 by default), or wrap the app with an ASGI-to-Lambda adapter such as Mangum. Test the image locally to ensure it works.
- Push Image to ECR: Create an AWS Elastic Container Registry (ECR) repository (if not already created). Tag and push your FastAPI Docker image to ECR. This makes it available for Lambda to use. (In general, you must build the image and upload it to ECR before creating the Lambda function (Create a Lambda function using a container image).)
- Create the Lambda Function: In the AWS Lambda console (region eu-west-2 for this example), create a new function. Choose Container Image as the deployment package type. Select the FastAPI image from ECR. Configure the function memory and timeout as appropriate for your application. For our FastAPI API, a few hundred MB of memory and a timeout of 30 seconds might be a starting point (adjust based on performance needs).
- IAM Role for Lambda: During creation, Lambda will either create a new execution role or let you choose an existing one. Ensure this Lambda execution role has basic permissions to run the function and write logs to CloudWatch (the AWSLambdaBasicExecutionRole managed policy provides CloudWatch Logs access). Since this Lambda will be fronted by CloudFront and doesn’t need to access other AWS resources in our scenario, no special permissions are needed here beyond the basics.
- Enable Function URL: In the Lambda configuration, enable a Function URL for this Lambda. This gives the function an HTTPS endpoint. Set Auth Type to NONE (public access) since CloudFront will be the one calling it (Using Amazon CloudFront with AWS Lambda as origin to accelerate your web applications | Networking & Content Delivery). Lambda will generate a Function URL (copy this URL). Example: https://abcde12345.lambda-url.eu-west-2.on.aws. Note: You can restrict this URL later using CloudFront and Origin Access Control if needed, but for initial setup, it’s open.
- Test the Lambda Function (Optional): You can invoke the Lambda via its function URL (e.g., using curl or a browser) to verify it’s working (it should return your FastAPI’s response, such as a health-check endpoint or homepage).
- Route 53 Domain Setup: Now map your custom domain (if you have one) to CloudFront. In Route 53, create an A record for your domain (or subdomain) and set it as an Alias to the CloudFront distribution domain (not the Lambda URL). This will direct traffic hitting your domain to CloudFront. Make sure the domain is listed as an Alternate Domain Name in CloudFront and you have an ACM certificate in us-east-1 for SSL. Once this is done, your site (both static and API paths) should be accessible via the friendly domain over HTTPS, fronted by CloudFront.
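One small detail when wiring the Function URL into CloudFront is that an origin is configured with a bare domain name, while the Lambda console shows a full URL. The sketch below, using only the standard library, extracts the hostname and checks it against the Function URL format quoted earlier. The function name origin_domain_from_function_url is hypothetical and the check is deliberately simple.

```python
from urllib.parse import urlparse


def origin_domain_from_function_url(function_url: str) -> str:
    """Return the hostname to enter as the CloudFront origin domain name."""
    parsed = urlparse(function_url)
    if parsed.scheme != "https" or not parsed.hostname:
        raise ValueError(f"not an HTTPS URL: {function_url!r}")
    host = parsed.hostname
    # Expected shape: <url-id>.lambda-url.<region>.on.aws
    labels = host.split(".")
    if len(labels) != 5 or labels[1] != "lambda-url" or labels[3:] != ["on", "aws"]:
        raise ValueError(f"does not look like a Lambda Function URL: {host!r}")
    return host


def region_from_function_url(function_url: str) -> str:
    """Return the AWS region embedded in the Function URL hostname."""
    return origin_domain_from_function_url(function_url).split(".")[2]


if __name__ == "__main__":
    url = "https://abcde12345.lambda-url.eu-west-2.on.aws/"
    print(origin_domain_from_function_url(url))
    print(region_from_function_url(url))
```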
With the distribution and origins in place, we now set up a Lambda@Edge function to log each request. This function will extract CloudFront’s request context and send a log entry to OpenObserve.
- Create Lambda Function in us-east-1: Lambda@Edge functions must be created in the US East (N. Virginia) region (Restrictions on Lambda@Edge - Amazon CloudFront). Using the Lambda console, switch to us-east-1 and create a new Lambda function (e.g., name it “CloudFrontRequestLogger”). Use the Python 3.x runtime (or Node.js, but in this case we assume Python for ease of HTTP requests). The function code will be provided in the next step. You can initially create the function with a basic empty handler and adjust settings after code is added.
- Implement Logging Code: Write the Python code to capture CloudFront event data and send logs to OpenObserve. In a Lambda@Edge viewer request event, the event object contains details such as: distribution ID, request ID, client IP, HTTP method, URI path, query string, headers, etc. (Lambda@Edge event structure - Amazon CloudFront). Your code should parse this event and construct a log record. For example, you might create a JSON object like:
from datetime import datetime, timezone

cf = event["Records"][0]["cf"]  # CloudFront payload of the Lambda@Edge event
request = cf["request"]
log_entry = {
    "distributionId": cf["config"]["distributionId"],
    "requestId": cf["config"]["requestId"],
    "clientIp": request["clientIp"],
    "uri": request["uri"],
    "method": request["method"],
    "headers": request["headers"],
    # Timezone-aware UTC timestamp in ISO 8601 format
    "timestamp": datetime.now(timezone.utc).isoformat(),
}
You can add other fields as needed (e.g., query string, user agent header, etc.). Once the log entry is created, send it to OpenObserve. OpenObserve can ingest logs via an HTTP API, including an Elasticsearch-compatible bulk interface (Sending logs to OpenObserve using syslog-ng - Blog). For instance, you could make an HTTP POST request such as POST https://<openobserve-url>/api/default/<stream>/_json with the log entry in the body. The ingestion path names a stream after the organization (here default); check the OpenObserve documentation for the exact endpoint your version exposes.
Use an HTTP client to send the data. The standard library's urllib.request needs no packaging; requests or urllib3 must be bundled with the function. Viewer-request functions have tight limits on package size and execution time, so a dependency-free client and a short timeout are the safer choices. Ensure you include any required authentication for OpenObserve (for example, basic auth or an API key, if OpenObserve is secured). Because environment variables are not supported in Lambda@Edge functions (Restrictions on Lambda@Edge - Amazon CloudFront), any needed credentials or endpoints must be included in the deployment package or derived from the request. Important: Wrap the HTTP call in a try/except so that logging failures do not impact the main request flow. If the OpenObserve endpoint is down or the request fails, the Lambda@Edge function should catch the exception and return the request unchanged, optionally logging the error to CloudWatch for later debugging.
- Assign an IAM Role with Edge Permissions: The Lambda@Edge function needs an execution role that CloudFront can assume. Create (or update) the function’s IAM role to trust the edgelambda.amazonaws.com service principal in addition to the regular Lambda principal (Set up IAM permissions and roles for Lambda@Edge - Amazon CloudFront). In practice, if you create the function via the console and check "Enable Lambda@Edge permissions", AWS will add the necessary trust policy for you. The trust policy should look like:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": ["lambda.amazonaws.com", "edgelambda.amazonaws.com"] },
      "Action": "sts:AssumeRole"
    }
  ]
}
This allows CloudFront (via the edgelambda service) to invoke your function at edge locations (Set up IAM permissions and roles for Lambda@Edge - Amazon CloudFront). Also ensure the role has permission to write CloudWatch Logs (the same AWSLambdaBasicExecutionRole is sufficient, which the console likely attached by default). No other special permissions are required unless your function calls other AWS services.
- Deploy the Lambda@Edge (Associate with CloudFront): After writing and testing your code (you can test it by simulating an event in the Lambda console using sample CloudFront events (How to start Lambda edge viewer request locally? - Stack Overflow)), it’s time to attach it to the CloudFront distribution. In the Lambda function console, use the “Deploy to Lambda@Edge” action. You will be prompted to specify:
  - CloudFront distribution ID to attach to.
  - Event trigger: choose Viewer Request (so it runs on every incoming request before CloudFront cache/origin processing).
  - Cache behavior: an association applies to one cache behavior at a time. The default (*) behavior only handles requests that match no other path pattern, so to log both static and dynamic requests, associate the function with the default behavior and with the /public-data/* behavior as well.
  - Lambda function version: you must use a published version, not $LATEST (Restrictions on Lambda@Edge - Amazon CloudFront). The console will prompt you to publish a new version if you haven’t already. Go ahead and publish a version of the function, and the console will use that version’s ARN for the association.
Confirm the deployment. AWS will replicate the Lambda to CloudFront edge locations (this can take a few minutes to complete). Once associated, CloudFront will invoke the Lambda@Edge on each request.
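The logging steps above can be pulled together into a single handler. What follows is a minimal sketch, not a definitive implementation: the OpenObserve URL, stream name (cloudfront), and Authorization value are placeholders you would replace, and the assumption that the endpoint accepts a JSON array should be checked against the OpenObserve documentation. The opener parameter is a hypothetical hook added here so the function can be exercised without a network call.

```python
import json
import urllib.request
from datetime import datetime, timezone

# Placeholders (assumptions): Lambda@Edge has no environment variables,
# so these values end up inside the deployment package.
OPENOBSERVE_URL = "https://openobserve.example.com/api/default/cloudfront/_json"
AUTH_HEADER = "Basic <base64-credentials>"


def build_log_entry(event):
    """Extract a compact log record from a viewer-request event."""
    cf = event["Records"][0]["cf"]
    request = cf["request"]
    return {
        "distributionId": cf["config"]["distributionId"],
        "requestId": cf["config"]["requestId"],
        "clientIp": request["clientIp"],
        "method": request["method"],
        "uri": request["uri"],
        "querystring": request.get("querystring", ""),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


def send_log(entry, opener=urllib.request.urlopen):
    """POST one entry as a single-element JSON array; short timeout."""
    req = urllib.request.Request(
        OPENOBSERVE_URL,
        data=json.dumps([entry]).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json", "Authorization": AUTH_HEADER},
    )
    with opener(req, timeout=1) as resp:
        return resp.status


def handler(event, context, opener=urllib.request.urlopen):
    """Viewer-request entry point: log, then always return the request."""
    request = event["Records"][0]["cf"]["request"]
    try:
        send_log(build_log_entry(event), opener)
    except Exception as exc:  # never let logging break the viewer request
        print(f"log forwarding failed: {exc}")
    return request


# Trimmed sample event for local testing (values are illustrative).
SAMPLE_EVENT = {"Records": [{"cf": {
    "config": {"distributionId": "EDFDVBD6EXAMPLE", "requestId": "example-id"},
    "request": {"clientIp": "203.0.113.178", "method": "GET",
                "uri": "/api/hello", "querystring": "", "headers": {}},
}}]}

if __name__ == "__main__":
    print(build_log_entry(SAMPLE_EVENT))
```

Returning the original request object unchanged is what lets CloudFront continue processing as if the function were not there; the one-second timeout is an arbitrary starting point to tune against measured latency.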
With everything in place, we should test the entire flow and verify that logs reach OpenObserve:
- Send Test Requests: Use a web browser or tools like curl/httpie to send a few requests to your CloudFront distribution (via the custom domain or the CloudFront domain). For example, fetch a static file (https://yourdomain.com/public-data/test.txt) and an API endpoint (https://yourdomain.com/api/hello). Ensure the content is served correctly (this means CloudFront is routing to the correct origin and the FastAPI Lambda is responding).
- Check OpenObserve: Log in to OpenObserve and navigate to the logs for your CloudFront distribution (depending on how you set up indexing or saved searches). You should see log entries corresponding to the requests you just made. Each log entry should contain the fields your Lambda@Edge function sent, for example the distribution ID, path, client IP, and user-agent. Verify that the data looks correct. If you see the entries, congratulations: CloudFront is now feeding data into OpenObserve in real-time!
- CloudWatch Logs (for Debugging): If logs did not appear in OpenObserve, or the Lambda@Edge isn’t behaving as expected, check CloudWatch logs for the logger function. Remember that Lambda@Edge logs are stored in CloudWatch in the regions where the function executes (Edge function logs - Amazon CloudFront). The log group will be named /aws/lambda/us-east-1.function-name, and it is created in each region where the function ran, not only in us-east-1. You can go to CloudFront’s monitoring dashboard, look at the Lambda@Edge function’s metrics by region, and then check CloudWatch in those regions for error logs (Edge function logs - Amazon CloudFront). Common issues might be missing permissions or exceptions thrown in the code. Use these logs to troubleshoot. Once any issues are fixed, publish a new Lambda version and update the CloudFront association (or use the console to redeploy) to test again.
By the end of this step, you should have a functioning pipeline: every request hitting CloudFront triggers the Lambda@Edge, which logs the request details to OpenObserve. You can now observe and analyze CDN request data almost instantly after the requests occur.
The manual steps above can be automated using Infrastructure as Code and continuous deployment practices. In this section, we outline how to use AWS CloudFormation to deploy the Lambda@Edge and related resources, and how to set up a CI/CD pipeline for the FastAPI application updates.
Managing Lambda@Edge via CloudFormation requires some careful setup because the Lambda must reside in us-east-1 and be associated with a CloudFront distribution. The provided CloudFormation template (see Appendix for a YAML snippet) does the following:
- Define the Logging Lambda Function (us-east-1): The template creates an AWS::Lambda::Function resource for the logging function. It includes the function code (which could be inline or packaged), runtime (Python 3.x), and the execution role with the trust policy for edgelambda.amazonaws.com (Restrictions on Lambda@Edge - Amazon CloudFront). If using CloudFormation, deploy this stack in us-east-1 to ensure the Lambda is in the correct region for CloudFront.
- Publish a Version of the Lambda: Because CloudFront requires an immutable version ARN, the template uses AWS::Lambda::Version on the function. Each deployment that changes the function code will create a new version. CloudFormation then references this version for association.
- CloudFront Distribution with Lambda Association: The template defines an AWS::CloudFront::Distribution resource. In the distribution’s DefaultCacheBehavior (and/or applicable behaviors), it adds a LambdaFunctionAssociations entry pointing to the Lambda function version ARN (from the Lambda::Version). For example, in YAML:
DefaultCacheBehavior:
  TargetOriginId: OriginB_FastAPI
  ViewerProtocolPolicy: "redirect-to-https"
  CachePolicyId: "4135ea2d-6df8-44a3-9df3-4b5a84be39ad"  # ID for Managed-CachingDisabled
  OriginRequestPolicyId: "b689b0a8-53d0-40ab-baf2-68738e2966ac"  # ID for Managed-AllViewerExceptHostHeader (a Function URL origin rejects a forwarded viewer Host header)
  LambdaFunctionAssociations:
    - EventType: viewer-request
      LambdaFunctionARN: !Ref LogFunctionVersion  # Ref to the published version resource
The template also sets up the two origins (S3 and Lambda URL) and behaviors as described in the manual setup. CloudFormation allows you to specify Origins and CacheBehaviors in the distribution config. The Lambda@Edge association in the template ensures the logging function is attached automatically upon stack creation/update.
-
IAM Permissions: In addition to the Lambda’s execution role, the template may need to include IAM permissions for CloudFront to successfully attach the Lambda. Specifically, when deploying via CloudFormation, the AWS CloudFormation service needs permission to create Lambda@Edge associations. Typically, your CloudFormation execution role (or the user deploying) should have the cloudfront:UpdateDistribution permission and the lambda:EnableReplication* permissions for it to replicate the Lambda to edge locations (Set up IAM permissions and roles for Lambda@Edge - Amazon CloudFront). In practice, deploying the distribution with the Lambda association in a single template will handle replication automatically, as long as the Lambda’s region and version are correct.
Using CloudFormation for this setup means you can re-create or update the entire stack consistently (e.g., in different environments or if you need to change the logging logic). The provided YAML template in the appendix can be used as a starting point and customized with your resource names and configurations.
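Two of the constraints above (a published version rather than $LATEST, and a function that lives in us-east-1) are easy to get wrong in a template parameter. As a hedged illustration, a pre-deployment check in Python might look like the sketch below; the function name check_edge_function_arn and the regular expression are assumptions written for this example, not part of any AWS tooling.

```python
import re

# Matches a numbered-version function ARN in the standard "aws" partition.
_VERSION_ARN = re.compile(
    r"^arn:aws:lambda:(?P<region>[a-z0-9-]+):(?P<account>\d{12}):function:"
    r"(?P<name>[A-Za-z0-9_-]+):(?P<version>\d+)$"
)


def check_edge_function_arn(arn: str) -> int:
    """Validate an ARN for a Lambda@Edge association and return its version."""
    match = _VERSION_ARN.match(arn)
    if match is None:
        # Covers unqualified ARNs, $LATEST, and aliases.
        raise ValueError(f"expected a numbered version ARN, got {arn!r}")
    if match["region"] != "us-east-1":
        raise ValueError(f"Lambda@Edge functions must be in us-east-1, got {match['region']}")
    return int(match["version"])


if __name__ == "__main__":
    arn = "arn:aws:lambda:us-east-1:123456789012:function:CloudFrontRequestLogger:3"
    print(check_edge_function_arn(arn))
```

A check like this could run in a pipeline step before the stack update, so a misconfigured ARN fails fast instead of partway through a distribution deployment.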
Automating the deployment of the FastAPI application can greatly speed up iterations. A typical CI/CD workflow might look like this:
- Source Control: Your FastAPI code resides in a repository (e.g., GitHub). Any code changes (commits/PRs) trigger the pipeline.
- Build Stage: Use GitHub Actions or AWS CodePipeline/CodeBuild to build the Docker image. For example, a GitHub Actions workflow can pull the code, then run docker build to produce the image.
- Push to ECR: After building, the pipeline should log in to AWS and push the new Docker image to the ECR repository. Tag the image appropriately (you could use a commit SHA or a version number as the tag).
- Update Lambda Function: Once the image is in ECR, the pipeline triggers a Lambda update. This can be done via AWS CLI or SDK command, for example: aws lambda update-function-code --function-name MyFastAPIFunc --image-uri <account-id>.dkr.ecr.<region>.amazonaws.com/<repo>:<tag>. This updates the Lambda to use the new image. Lambda resolves the tag to a specific image digest when the function is updated, so pushing a new image under the same tag does not change the running function until update-function-code is called again (Will Lambda deployed from image pull the ECR image on every ...).
- Invalidate CloudFront Cache (If caching was enabled): In our case, we disabled caching at CloudFront. If you ever enable caching for certain paths, you’d want the pipeline to create a CloudFront invalidation for those paths after a deployment, so users get the updated content immediately.
- Verify Deployment: The pipeline could include a test step (for example, calling a health-check endpoint on the FastAPI through the CloudFront URL) to ensure the new version is running.
This CI/CD ensures that developers can merge code and have the changes live without manual intervention. It also reduces risk by automating the build and deployment steps. The appendix includes a sample snippet of a GitHub Actions workflow that builds a Docker image and updates the Lambda.
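The "Update Lambda Function" step can also be scripted. Below is a small Python sketch that assembles the image URI and the same AWS CLI command shown above, then runs it. It assumes the AWS CLI is installed and authenticated in the pipeline; the repository name fastapi-app and the run parameter (a hook for substituting subprocess.run in tests) are illustrative.

```python
import subprocess


def image_uri(account_id: str, region: str, repo: str, tag: str) -> str:
    """Build the ECR image URI in the form used by update-function-code."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repo}:{tag}"


def update_function_command(function_name: str, uri: str, region: str) -> list:
    """Return the AWS CLI argument list that points the function at `uri`."""
    return [
        "aws", "lambda", "update-function-code",
        "--function-name", function_name,
        "--image-uri", uri,
        "--region", region,
    ]


def deploy(function_name, account_id, region, repo, tag, run=subprocess.run):
    """Run the update command; raises if the CLI exits with an error."""
    cmd = update_function_command(
        function_name, image_uri(account_id, region, repo, tag), region
    )
    run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Prints the command instead of executing it.
    uri = image_uri("123456789012", "eu-west-2", "fastapi-app", "abc1234")
    print(" ".join(update_function_command("MyFastAPIFunc", uri, "eu-west-2")))
```

Tagging with the commit SHA, as suggested above, makes each deployment traceable to the exact source revision.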
Note: The Lambda@Edge logging function typically doesn’t change as often as the application. But if you do need to update the logging logic, you should update the CloudFormation template or Lambda code in us-east-1 and deploy a new version. Keep in mind that updating a Lambda@Edge function (publishing a new version) and associating it with CloudFront can take time to propagate globally (usually a few minutes). Plan maintenance windows accordingly if the logging function is critical.
When implementing Lambda@Edge logging to OpenObserve, consider the following best practices and caveats:
- Security: Limit the permissions of all components. The Lambda@Edge execution role should only allow the minimal actions necessary (e.g., CloudWatch Logs writing). It should trust only the lambda.amazonaws.com and edgelambda.amazonaws.com principals (Set up IAM permissions and roles for Lambda@Edge - Amazon CloudFront). Avoid hardcoding sensitive information in the code where you can. If OpenObserve requires authentication, use an ingestion token or credentials that are scoped to ingestion only and specific to this logging purpose, so that a leaked value cannot be used to read data. Since environment variables are not supported in Lambda@Edge, a credential may have to ship inside the deployment package; obfuscation or lightweight encryption offers only limited protection, so keep the credential's scope minimal. Also, secure the OpenObserve endpoint (e.g., restrict its IP access to CloudFront or your AWS ranges, or use TLS and authentication) so that third parties cannot send fake logs.
- Performance Impact: Be mindful that the Lambda@Edge function runs synchronously on the request path. Every millisecond it spends adds to the latency seen by the user. The logging function’s network call to OpenObserve could slow down responses if not optimized. To mitigate this:
- Keep the log payload small (only essential fields).
- Use non-blocking or async calls if possible, or at least ensure the HTTP client is fast. In many cases, the overhead will be small (tens of milliseconds) but at scale it could add up.
- Monitor the duration of the Lambda@Edge execution in CloudFront metrics. If it becomes high, consider alternatives.
- Alternative Approaches: For very high traffic sites or performance-critical scenarios, you might leverage CloudFront’s real-time logs feature instead of Lambda@Edge. Real-time logs can push request data to a Kinesis Data Stream within seconds (amazon web services - How to capture lambda @edge requests to kinesis? - Stack Overflow), which you can then feed into OpenObserve (as described in an OpenObserve blog using Kinesis Firehose (Monitoring CloudFront Access Logs with Kinesis Streams & Amazon Data Firehose: A Step-by-Step Guide | Open Source Observability Platform for Logs, Metrics, Traces, and More – Your Ultimate Dashboard for Alerts and Insights)). This decouples logging from the request path. However, it adds complexity (managing Kinesis and Firehose). Weigh the trade-offs – Lambda@Edge is simpler to deploy for moderate traffic, whereas Kinesis-based real-time logs might handle extreme scale more gracefully without impacting user latency.
- Cost Considerations: Lambda@Edge invocations cost money per request, and sending data out to OpenObserve might incur data transfer fees (if OpenObserve is external to AWS or in a different region). Keep an eye on:
- Lambda@Edge costs: You are billed for execution time and requests. The logging function should be lightweight to minimize execution time costs.
- Data Transfer: If OpenObserve is running outside of AWS or in a different AWS region than the CloudFront edge location, those HTTP calls could count as external data transfer. Ideally, run OpenObserve in a region that’s frequently an origin or near your users, or consider deploying OpenObserve in AWS.
- OpenObserve storage costs: More logs means more storage and indexing. Make sure to set appropriate retention policies in OpenObserve to drop older data if you don’t need it, or archive to cheaper storage.
- Scalability: This solution will scale automatically with traffic, up to the limits of Lambda@Edge and OpenObserve:
- CloudFront and Lambda@Edge can handle very high request rates. Ensure your Lambda@Edge code is efficient and stateless. AWS handles scaling the function across edge locations (with regional concurrency limits per function (Restrictions on Lambda@Edge - Amazon CloudFront), which are usually high by default).
- OpenObserve needs to handle the ingress of log events. If using a self-hosted OpenObserve, monitor its ingestion rate and performance. You might need to run OpenObserve in a clustered mode for high volumes, or use OpenObserve Cloud offering for scalability.
- Test the system under load to ensure that the added latency per request is negligible and that OpenObserve can keep up with the log volume.
- Observability of the Logger: It might sound meta, but ensure you can monitor the logging lambda itself. Use CloudWatch metrics (e.g., Lambda errors, duration) and logs to catch issues. Set up CloudWatch Alarms for any invocation errors or throttling on the Lambda@Edge function. If the function starts failing, you’d want to know quickly (via an alert) since that could mean missing log data.
- Deployment and Versioning: When updating the Lambda@Edge function’s code, remember that CloudFront will continue invoking the old version until you explicitly update the association to a new version. It’s wise to test new versions thoroughly (perhaps in a staging distribution) before switching over in production. Keep previous versions around until you verify the new one works, so you can roll back if needed.
- Compliance and Privacy: Since this setup logs every request (including potentially user identifiers, IP addresses, URLs accessed, etc.), ensure that you handle this data in compliance with privacy laws and policies. If needed, you can mask or omit sensitive data in the Lambda@Edge code before sending to OpenObserve. For example, you might not want to log query parameters that contain personal data. Tailor the logging content to balance observability with privacy.
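Two of the practices above, keeping the payload small and masking sensitive data, can be applied in the same place: the code that turns the CloudFront request into a log record. The following Python sketch shows one possible approach under stated assumptions. The header allow-list, the /24 and /48 masking prefixes, and the hasQuery field are choices made for this example, not requirements of CloudFront or OpenObserve.

```python
import ipaddress

# Assumption: only these headers are worth logging; adjust to your needs.
ALLOWED_HEADERS = ("host", "user-agent", "referer")


def flatten_headers(cf_headers, allowed=ALLOWED_HEADERS):
    """Reduce CloudFront's {name: [{key, value}]} structure to {name: value}."""
    flat = {}
    for name in allowed:
        values = cf_headers.get(name)
        if values:
            flat[name] = values[0]["value"]
    return flat


def mask_ip(ip: str) -> str:
    """Coarsen an address: keep a /24 for IPv4 and a /48 for IPv6."""
    addr = ipaddress.ip_address(ip)
    prefix = 24 if addr.version == 4 else 48
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


def compact_entry(request):
    """Build a small, privacy-conscious record from a CloudFront request."""
    return {
        "clientIp": mask_ip(request["clientIp"]),
        "method": request["method"],
        "uri": request["uri"],
        # Record only whether a query string was present, not its contents.
        "hasQuery": bool(request.get("querystring")),
        "headers": flatten_headers(request.get("headers", {})),
    }


if __name__ == "__main__":
    sample = {
        "clientIp": "203.0.113.178", "method": "GET", "uri": "/api/hello",
        "querystring": "email=a%40b.com",
        "headers": {
            "user-agent": [{"key": "User-Agent", "value": "curl/8.0"}],
            "cookie": [{"key": "Cookie", "value": "session=secret"}],
        },
    }
    print(compact_entry(sample))
```

An allow-list is used rather than a deny-list so that unexpected headers (cookies, authorization tokens) are dropped by default.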
In summary, follow the principle of least privilege for security, measure the performance impact, and build safety nets (monitoring/alerts) for your logging pipeline. When done correctly, Lambda@Edge logging can be a powerful tool that operates transparently, with users unaware that each request is being logged and analyzed in real-time behind the scenes.
Implementing AWS Lambda@Edge logging to OpenObserve provides a robust solution for real-time monitoring of CloudFront traffic. We began by setting up a CloudFront distribution with multiple origins (serving both static and dynamic content), and introduced a Lambda@Edge function to intercept requests at the edge. This function enriches our observability by capturing detailed request metadata and sending it to OpenObserve immediately.
The integration with OpenObserve turns raw request data into actionable insights – you can leverage OpenObserve’s search, analytics, and alerting to track user behavior, monitor performance, and detect anomalies across your CDN layer (Monitoring CloudFront Access Logs with Kinesis Streams & Amazon Data Firehose: A Step-by-Step Guide | Open Source Observability Platform for Logs, Metrics, Traces, and More – Your Ultimate Dashboard for Alerts and Insights). By logging at the edge, we ensured minimal latency in capturing data and avoided the lag of traditional log delivery mechanisms.
Summary of Benefits: This setup demonstrates how serverless technology can be used to improve transparency of a serverless CDN:
- We achieved real-time logging without managing any servers or heavy pipelines.
- The solution is highly scalable and globally distributed, thanks to CloudFront and Lambda@Edge.
- OpenObserve, being an open-source observability platform, reduces reliance on proprietary logging systems and can be cost-effective for large volumes of data.
Next Steps: Organizations adopting this pattern can extend it in various ways. For example:
- Enhance Log Content: Incorporate response information or custom headers (like trace IDs) into the logs by using an origin-response Lambda@Edge trigger in addition to viewer-request, to log status codes or response times.
- Integrate Alerts and Dashboards: Use OpenObserve to set up real-time alerts (for high error rates, unusual traffic spikes, etc.) so that operations teams are notified promptly. Build dashboards showing metrics like request count by country, top requested URLs, cache versus origin hits (if you enable some caching), etc.
- Optimize and Harden: If the logging volume grows, consider switching to CloudFront real-time logs with Kinesis as mentioned, or implement batching in the Lambda@Edge (though note, Lambda@Edge has size/time limits that constrain how much you can batch in one invocation).
- Infrastructure as Code: Finalize the CloudFormation templates or Terraform scripts for the entire stack, and integrate deployment of the logging function into your CI/CD process as well, ensuring that any changes go through code review and automated testing.
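As an illustration of the "Enhance Log Content" idea, an origin-response function receives both the request and the origin's response in its event, so it can record the status code alongside the request ID used by the viewer-request logger. The Python sketch below assumes the standard Lambda@Edge event layout; emit is a hypothetical stand-in for whatever forwarding function you use, defaulting to print.

```python
def build_response_entry(event):
    """Extract status information from an origin-response event."""
    cf = event["Records"][0]["cf"]
    response = cf["response"]
    return {
        "requestId": cf["config"]["requestId"],
        "uri": cf["request"]["uri"],
        "status": response["status"],  # a string such as "200"
        "statusDescription": response.get("statusDescription", ""),
    }


def handler(event, context, emit=print):
    """Origin-response entry point: log, then always return the response."""
    response = event["Records"][0]["cf"]["response"]
    try:
        emit(build_response_entry(event))
    except Exception as exc:  # logging must not alter the response
        print(f"response logging failed: {exc}")
    return response


# Trimmed sample event for local testing (values are illustrative).
SAMPLE_RESPONSE_EVENT = {"Records": [{"cf": {
    "config": {"distributionId": "EDFDVBD6EXAMPLE", "requestId": "example-id",
               "eventType": "origin-response"},
    "request": {"clientIp": "203.0.113.178", "method": "GET",
                "uri": "/api/hello", "querystring": "", "headers": {}},
    "response": {"status": "200", "statusDescription": "OK", "headers": {}},
}}]}

if __name__ == "__main__":
    handler(SAMPLE_RESPONSE_EVENT, None)
```

Since origin-response triggers fire only when CloudFront contacts the origin, this complements rather than replaces the viewer-request logger once caching is enabled.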
By implementing Lambda@Edge logging to OpenObserve, you gain a high degree of observability into your content delivery network. This leads to faster debugging, better security monitoring, and insights that can drive improvements in user experience. The combination of AWS’s edge computing and OpenObserve’s analytics proves to be a powerful synergy for modern cloud architectures.
Below is an excerpt from a CloudFormation template that sets up the Lambda@Edge logging function and attaches it to a CloudFront distribution. This snippet omits some details for brevity (like the full AWS::CloudFront::Distribution properties and the S3 origin setup), but highlights the critical parts:
AWSTemplateFormatVersion: "2010-09-09"
Description: "CloudFront with Lambda@Edge logger and FastAPI origin"
Parameters:
  LambdaFunctionCodeS3Bucket:
    Type: String
    Description: "S3 bucket containing the Lambda@Edge deployment package"
  LambdaFunctionCodeS3Key:
    Type: String
    Description: "S3 key for the Lambda@Edge deployment package (zip file)"
Resources:
  ## IAM role for Lambda@Edge ##
  EdgeLambdaExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - lambda.amazonaws.com
                - edgelambda.amazonaws.com
            Action: sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
  ## Lambda@Edge Function ##
  LogRequestsEdgeFunction:
    Type: AWS::Lambda::Function
    Properties:
      FunctionName: "CloudFrontRequestLogger"
      Handler: "index.handler"
      Runtime: python3.9
      Role: !GetAtt EdgeLambdaExecutionRole.Arn
      Code:
        S3Bucket: !Ref LambdaFunctionCodeS3Bucket
        S3Key: !Ref LambdaFunctionCodeS3Key
      MemorySize: 128
      Timeout: 5 # short timeout since this is quick log forwarding
      # Environment: (Not allowed for Lambda@Edge per AWS restrictions)
  ## Publish a Version of the Lambda (required for CloudFront) ##
  LogRequestsEdgeFunctionVersion:
    Type: AWS::Lambda::Version
    Properties:
      FunctionName: !Ref LogRequestsEdgeFunction
  ## CloudFront Distribution ##
  CloudFrontDistribution:
    Type: AWS::CloudFront::Distribution
    Properties:
      DistributionConfig:
        Enabled: true
        Aliases: ["yourdomain.com"] # optional custom domain
        DefaultRootObject: "index.html"
        Origins:
          - Id: Origin1_S3
            DomainName: your-bucket.s3.amazonaws.com
            S3OriginConfig: {}
          - Id: Origin2_LambdaURL
            DomainName: "<Lambda-Function-URL-without-https>"
            CustomOriginConfig:
              OriginProtocolPolicy: https-only
        DefaultCacheBehavior:
          TargetOriginId: Origin2_LambdaURL
          ViewerProtocolPolicy: redirect-to-https
          CachePolicyId: "4135ea2d-6df8-44a3-9df3-4b5a84be39ad" # Managed-CachingDisabled
          OriginRequestPolicyId: "b689b0a8-53d0-40ab-baf2-68738e2966ac" # Managed-AllViewerExceptHostHeader
          LambdaFunctionAssociations:
            - EventType: viewer-request
              LambdaFunctionARN: !Ref LogRequestsEdgeFunctionVersion
        CacheBehaviors:
          - PathPattern: "/public-data/*"
            TargetOriginId: Origin1_S3
            ViewerProtocolPolicy: redirect-to-https
            CachePolicyId: "4135ea2d-6df8-44a3-9df3-4b5a84be39ad" # Managed-CachingDisabled
            OriginRequestPolicyId: "88a5eaf4-2fd4-4709-b370-b4c650ea3fcf" # Managed-CORS-S3Origin
            LambdaFunctionAssociations:
              - EventType: viewer-request
                LambdaFunctionARN: !Ref LogRequestsEdgeFunctionVersion
        # ... (Logging, PriceClass, etc., as needed)
        ViewerCertificate:
          AcmCertificateArn: "arn:aws:acm:us-east-1:123456789012:certificate/abcde-1234-5678-9012-abcdef"
          SslSupportMethod: "sni-only"
          MinimumProtocolVersion: "TLSv1.2_2018"
Notes: In this template snippet:
- We assume the Lambda@Edge code is packaged as a ZIP and uploaded to S3 (CloudFormation pulls it from there). Alternatively, if the code is short, you could inline it with the ZipFile property of Code.
- The Lambda Function URL origin is represented by its domain name (without the https:// prefix). A Lambda Function URL only accepts requests whose Host header matches its own domain, so the origin request policy must not forward the viewer's Host header. The managed AllViewerExceptHostHeader policy on the default behavior forwards everything else and lets CloudFront send the origin's own domain as Host.
- CachePolicyId and OriginRequestPolicyId use the IDs of AWS managed policies (CachingDisabled, AllViewerExceptHostHeader, and CORS-S3Origin). In a real template you can supply these IDs directly as above, or define your own policies with AWS::CloudFront::CachePolicy and AWS::CloudFront::OriginRequestPolicy resources and reference them with !Ref.
- LambdaFunctionAssociations is applied to both the default behavior and the specific /public-data/* behavior, so that all requests trigger the logging function. If you wanted to exclude logging for the static content (to reduce log volume), you could attach the Lambda@Edge only to the default behavior.
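The note about supplying the Lambda Function URL without its scheme can be automated when the DomainName value is generated by a script. The following is a small sketch under that assumption. The helper name `origin_domain_from_url` and the example URL are made up for illustration.

```python
from urllib.parse import urlparse


def origin_domain_from_url(function_url: str) -> str:
    """Return the bare host name CloudFront expects in an origin's DomainName."""
    # Accept input with or without a scheme so the helper is safe to call twice.
    candidate = function_url if "://" in function_url else "https://" + function_url
    host = urlparse(candidate).hostname
    if not host:
        raise ValueError(f"no host name found in {function_url!r}")
    return host


# Example with a made-up Function URL:
print(origin_domain_from_url("https://abc123.lambda-url.eu-west-2.on.aws/"))
```

The returned value can be passed to the stack as a parameter in place of the `<Lambda-Function-URL-without-https>` placeholder.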
This CloudFormation template can be deployed with the AWS CLI or through the CloudFormation console. Deploy the stack in us-east-1, because Lambda@Edge functions must be created in that region.
Below is a conceptual example of a GitHub Actions workflow for building and deploying the FastAPI application Lambda (container image). This assumes you have AWS credentials configured in the repository secrets and appropriate IAM permissions to push to ECR and update Lambda.
name: Build and Deploy FastAPI
on:
  push:
    branches: [ main ]
env:
  AWS_REGION: eu-west-2
  ECR_REPOSITORY: my-fastapi-repo
  FUNCTION_NAME: MyFastAPIFunction
  AWS_ACCOUNT_ID: ${{ secrets.AWS_ACCOUNT_ID }} # assumes a repository secret with this name
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Code
        uses: actions/checkout@v3
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-region: ${{ env.AWS_REGION }}
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
      - name: Log in to ECR
        run: |
          aws ecr get-login-password --region $AWS_REGION | docker login --username AWS --password-stdin ${AWS_ACCOUNT_ID}.dkr.ecr.$AWS_REGION.amazonaws.com
      - name: Build Docker Image
        run: |
          docker build -t ${AWS_ACCOUNT_ID}.dkr.ecr.$AWS_REGION.amazonaws.com/$ECR_REPOSITORY:latest .
      - name: Push Image to ECR
        run: |
          docker push ${AWS_ACCOUNT_ID}.dkr.ecr.$AWS_REGION.amazonaws.com/$ECR_REPOSITORY:latest
      - name: Update Lambda Function
        run: |
          aws lambda update-function-code --function-name $FUNCTION_NAME \
            --image-uri ${AWS_ACCOUNT_ID}.dkr.ecr.$AWS_REGION.amazonaws.com/$ECR_REPOSITORY:latest
Explanation:
- The workflow triggers on pushes to the main branch.
- It logs into AWS and ECR, builds the Docker image for the FastAPI app, then pushes it to ECR with the latest tag.
- Finally, it updates the Lambda function code to use the new image. Using the latest tag is simple, but in production you might use versioned tags (and pass that tag to the update-function-code command) so there is no ambiguity about which code is deployed.
- You would replace AWS_ACCOUNT_ID with your own account ID, possibly supplying it through a secret or environment variable.
- This example uses the AWS CLI; alternatively, one could use the AWS Lambda GitHub Action or AWS SAM CLI, etc.
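The versioned-tag suggestion above could be sketched as follows, assuming the tag is derived from the Git commit SHA (GitHub Actions exposes it as the GITHUB_SHA environment variable). The helper names `tag_from_commit` and `image_uri` are hypothetical and the account ID in the example is a placeholder.

```python
import string


def tag_from_commit(sha: str, length: int = 12) -> str:
    """Derive an image tag from a Git commit SHA, shortened for readability."""
    if not sha or any(c not in string.hexdigits for c in sha):
        raise ValueError(f"not a hex commit SHA: {sha!r}")
    return sha[:length].lower()


def image_uri(account_id: str, region: str, repository: str, tag: str) -> str:
    """Build the ECR image URI in the same shape the workflow uses."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


# Example with placeholder values:
uri = image_uri(
    "123456789012",
    "eu-west-2",
    "my-fastapi-repo",
    tag_from_commit("0123456789abcdef0123456789abcdef01234567"),
)
print(uri)
```

Passing the same URI to both the push step and update-function-code ties each deployment to one specific commit, which also makes rollbacks a matter of redeploying an earlier tag.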
After this pipeline runs, the Lambda function in eu-west-2 is updated with the new FastAPI code. Once the update completes, CloudFront serves the new version on the next request, because caching is disabled and every request is fetched from the Lambda URL.
You can extend this workflow with steps to run tests, do security scans on the image, or notify a Slack channel upon deployment. If using CodePipeline, the stages would be analogous: Source -> Build (containerize) -> Deploy (push and update Lambda).
By following the guidelines and examples in this whitepaper, you can successfully implement and automate a global request logging system for CloudFront using Lambda@Edge and OpenObserve, achieving greater visibility into your content delivery with a modern, serverless approach.