- Christos Marco, a software engineer at Elastic, discusses collecting and processing logs in Kubernetes environments using the OpenTelemetry Collector. (00:11:06)
- OpenTelemetry is an observability framework that provides specifications and implementations for observability, allowing users to instrument applications and collect telemetry signals. (00:12:18)
- The OpenTelemetry collector can be used to receive, process, and export telemetry data to other systems, and it consists of pipelines that can be configured with receivers, processors, and exporters. (00:14:09)
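The receiver/processor/exporter wiring described above can be sketched as a minimal collector configuration; the specific components shown here (filelog receiver, batch processor, debug exporter) are common choices for log collection, not necessarily the talk's exact setup:

```yaml
receivers:
  filelog:
    include: [ /var/log/pods/*/*/*.log ]   # read container log files from disk
processors:
  batch: {}                                 # batch records before export
exporters:
  debug: {}                                 # print records for debugging
service:
  pipelines:
    logs:
      receivers: [filelog]
      processors: [batch]
      exporters: [debug]
```

A pipeline is just this named chain under `service.pipelines`; multiple pipelines can share receivers and exporters.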
- To collect container logs from Kubernetes environments, the collector can read logs directly from files on disk or use OpenTelemetry SDKs to instrument applications and capture logs. (00:14:50)
- A new parser for the Filelog receiver, called the container parser, has been introduced to simplify the logic for parsing container logs and extracting Kubernetes metadata. (00:19:14)
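The container parser is configured as an operator on the Filelog receiver; a minimal sketch (the include path follows the usual Kubernetes pod log layout):

```yaml
receivers:
  filelog:
    include: [ /var/log/pods/*/*/*.log ]
    operators:
      - type: container   # auto-detects docker/containerd/cri-o formats and
                          # attaches Kubernetes metadata as attributes
```

Before this operator existed, users had to chain several format-specific parsers and recombine operators by hand.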
- The Filelog receiver is useful for collecting logs from containers and Kubernetes clusters, but it has limitations, such as not being able to guarantee the order of log arrival, which can make it difficult to recombine logs properly (00:20:43).
- To further enrich logs with additional signals and context-specific information, the OpenTelemetry SDKs can be used to instrument applications and capture their logs, producing nicely structured logs that follow the OpenTelemetry log data model (00:21:30).
- Combining the Filelog receiver with the OpenTelemetry SDKs can help solve problems like log duplication and extra overhead: the SDKs are configured to use the standard-output (console) log exporter, writing the OpenTelemetry records to the console, where they can then be collected by the Filelog receiver (00:22:30).
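Pointing an SDK's log exporter at standard output can be done with the standard OpenTelemetry environment variable; a pod-spec fragment (container name and image are hypothetical):

```yaml
containers:
  - name: my-app            # hypothetical container
    image: example/my-app   # hypothetical image
    env:
      - name: OTEL_LOGS_EXPORTER
        value: console      # write OTLP-shaped log records to stdout
```

The container runtime then writes those lines to the pod's log file on disk, where the Filelog receiver picks them up.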
- The collected logs can be further processed and deserialized using a transform processor and the OTLP/JSON connector, producing a valid OTLP log record that can be exported to an observability backend (00:24:12).
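The deserialization step can be wired with the `otlpjson` connector from collector-contrib, which turns OTLP/JSON lines collected by the Filelog receiver back into native OTLP records. A sketch (the debug exporter is illustrative, and the transform-processor step from the talk is omitted here):

```yaml
receivers:
  filelog:
    include: [ /var/log/pods/*/*/*.log ]
connectors:
  otlpjson: {}          # parses OTLP/JSON log bodies into OTLP records
exporters:
  debug: {}
service:
  pipelines:
    logs/raw:           # raw file lines go into the connector
      receivers: [filelog]
      exporters: [otlpjson]
    logs/otlp:          # the connector feeds reconstructed OTLP records onward
      receivers: [otlpjson]
      exporters: [debug]
```

A connector acts as the exporter of one pipeline and the receiver of another, which is what allows this two-stage reconstruction.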
- A demo is shown using the OpenTelemetry demo application, which consists of multiple microservices instrumented with the OpenTelemetry SDK, and the Collector, which collects logs from files on disk and exports them to the console for debugging purposes (00:26:35).
- The discussion revolves around the OpenTelemetry Collector and its features, including support for additional formats like the klog structured logging format used by Kubernetes control-plane components and operators. (00:32:29)
- The Collector will be able to automatically enable log collection based on hints provided as pod annotations, allowing for a better user experience. (00:33:09)
- There are plans to support more well-known formats from well-known technologies and provide a way to package them better for users. (00:33:27)
- The Filelog receiver grew out of the donation of the Stanza agent and library by observIQ, and it runs Stanza under the hood. (00:35:36)
- There are no plans to convert the Stanza operators to OTel processors due to limitations, such as the inability to guarantee the order of the whole pipeline. (00:35:33)
- A presentation by Yan Vich and Panak discusses database deployment with Percona, a company that has been providing free and open-source software for about 20 years. (00:45:45)
- The presentation highlights the pains of using cloud-managed databases, including high costs, security and data privacy concerns, limited customization, and vendor lock-in. (00:48:21)
- Users expect a reliable, cost-effective, and user-friendly database deployment, but achieving this in Kubernetes can be challenging due to the complexity of choosing the right operator (00:50:14).
- There are many operators available, each with their own approach to life cycle management, and choosing the right one can be difficult (00:52:51).
- Even if an operator fits well today, it may not be the best choice in the future, and replacing it can be painful (00:53:14).
- High availability is a critical value, but different operators have different approaches to achieving it, making it a nightmare to implement and manage (00:55:27).
- The number of versions of popular operators is constantly changing, making it hard to keep up with the latest developments (00:57:21).
- Despite the challenges, Kubernetes and open-source solutions like PostgreSQL offer great benefits, including scalability, high availability, and freedom from vendor lock-in (00:59:18).
- Percona is working on Percona Everest, a solution to simplify database deployment and management, aiming to provide the best of both worlds in terms of experience and user needs (01:01:00).
- The goal of Percona Everest is to allow users to deploy their databases through a UI, design replica sets, define affinity configuration, create backup policies, and more, while the operator handles the heavy lifting (01:03:07).
- Percona Everest is open-source and is being developed to the point where users can run it in production for all of their databases, with the company seeking feedback from users (01:03:45).
- The main problem with open-source projects is that people often don't feel entitled to ask for more, so Percona is looking for feedback on what's missing, what's great, and what can be improved (01:04:21).
- Percona Everest's UI is open-source, written in TypeScript and React, and can be integrated into other platforms like Backstage (01:06:56).
- Percona provides free and open-source software, with no non-open-source software, and believes that open-source is an adoption technique, not a way to make someone use proprietary software (01:07:50).
- Percona sees Percona Everest as similar to other Cloud Native Computing Foundation projects like Crossplane, but with a different approach, and believes that multiple solutions can fill the deployment issue gap (01:09:02).
- Open-core products often gate day-two operations behind a license, limiting the open-source deployment. (01:10:29)
- Everest aims to make everything open source, and it is working on making its database engines, operators, and the Everest operator ARM-ready. (01:11:23)
- Everest exposes an API through its database engines, but not directly, and users can connect to the database using the connection URL for tasks like backup and restore. (01:14:12)
- Multicluster support is on Everest's roadmap, and it will allow managing multiple Kubernetes clusters from a single Everest instance. (01:14:59)
- Everest will release Helm charts in November, enabling integration with Argo CD and Flux. (01:15:46)
- WebAssembly (wasm) is a portable binary instruction format that code written in many languages can compile to, making it useful for server-side applications. (01:21:53)
- WebAssembly (wasm) aims to deliver near-native performance, portability, and security, approaching the speed of native machine code while running in a sandboxed environment (01:23:44).
- Unikernels strip down the base operating system to the kernel and necessary libraries, creating a smaller unit that can be used to run applications with better performance, portability, and security (01:25:50).
- Unikernels and wasm approach the same promise from different angles, with wasm taking an application-first approach and unikernels focusing on minimizing the operating system for Ops folks (01:29:50).
- Combining wasm and unikernels can result in a single image that gains the performance benefits of unikernels and the portability of wasm, allowing for fast boot times and quick execution (01:32:23).
- Unikernels are a technology that can provide benefits such as performance, portability, and security, but their full potential is not yet realized (01:34:34).
- Different unikernel communities, such as Unikraft, Nanos, and Hermit, are working on various projects, and some technologies, like Firecracker, are popular for their ability to help realize the benefits of unikernels (01:35:45).
- Unikernels have a minimal attack surface due to their small size, making them a secure option, and they can also isolate individual applications effectively (01:37:31).
- WebAssembly (WASM) has a sandbox feature that provides memory safety and other security benefits, but it lacks native or automatic integrity checks, which is a challenge (01:39:05).
- The combination of unikernels and WASM seems like the best bet for achieving performance, portability, and security at the application and infrastructure levels (01:41:26).
- Unikernels have a tightly coupled nature, making it difficult to generate images, and creating a better developer experience around this is needed (01:42:23).
- The community is working on improving the developer experience, but it's complicated by the need for awareness from both the application and infrastructure sides (01:42:44).
- The combination of unikernels and WASM can provide cost savings and improved developer experience, but it requires investment in redesigning systems and processes (01:45:06).
- Unikernels can run on bare metal: while the architecture typically targets hypervisors, there is a way to compile a unikernel to run directly on bare metal (01:46:22).
- Unikernels are intended to run different applications on different kinds of infrastructure, not necessarily orchestrated as containers (01:47:27).
- Kubernetes is a good way to run on any architecture, but that is largely independent of what unikernels are trying to do with the underlying technology pieces (01:48:27).
- There are edge and Internet of Things applications that run in production with unikernels, but few examples of cloud-based applications (01:48:46).
- Unikernels can be used to experiment with a good architecture (01:49:14).
- A question about running unikernels inside Kubernetes using KubeVirt as another way to orchestrate virtual machines was met with uncertainty (01:49:21).
- Intuit is a global fintech company that builds several financial products based on an AI-driven expert platform, serving around 100 million customers across financial products like TurboTax, QuickBooks, and Credit Karma (02:11:21).
- Intuit's AI native development platform has four major pillars: AI-powered app experiences, AI-assisted development, AI-powered app-centric runtime, and Smart operations using IT operations analytics (02:12:01).
- The platform has two personas: service developers who focus on building app logic and shipping it faster, and platform experts who expose interfaces and capabilities to service developers (02:12:51).
- Challenges at Intuit include a steep learning curve for developers to understand Kubernetes internals and API, a long inner loop cycle for local development with dependencies, and tech refreshers and migrations that disrupt developer workflows (02:13:51).
- Intuit's solution is to abstract the Kubernetes layer, making it easier for developers to manage applications without needing to understand Kubernetes (02:13:30).
- The current state of deployment involves multiple steps, including developing and deploying an app, understanding Kubernetes, configuring Kubernetes primitives, onboarding to API management, and potentially onboarding to service mesh, which can be overwhelming for developers (02:16:17).
- The goal is to create an abstracted platform, called "Air," that allows developers to focus solely on their app logic, without worrying about the underlying infrastructure (02:17:21).
- Air has three broad categories: application-centric runtime, traffic management, and debugging tools, which aim to provide a hands-free experience for developers (02:19:05).
- The application specification in Air is comprised of components and traits, influenced by the Open Application Model, which captures the user's intent and translates it into infrastructure configurations (02:20:15).
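Intuit's AppSpec schema is not shown in the talk, but an Open Application Model spec with components and traits conventionally looks like the sketch below (KubeVela-flavored OAM; every name here is illustrative, not Intuit's actual schema):

```yaml
apiVersion: core.oam.dev/v1beta1
kind: Application
metadata:
  name: my-service              # hypothetical application
spec:
  components:
    - name: web                 # component: the workload itself
      type: webservice
      properties:
        image: example/web:1.0  # hypothetical image
        port: 8080
      traits:                   # traits: operational intent layered on top
        - type: scaler
          properties:
            replicas: 3
```

The platform translates this declared intent into concrete Kubernetes primitives (Deployments, Services, mesh and gateway configuration) behind the scenes.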
- The migration process from a custom platform to Air involves a self-serve experience, where developers can begin migration, and the system checks if migration is allowed, creates an AirStack, and allows users to dial traffic between custom and Air (02:23:37).
- There are different scenarios that can occur during migration, including successful migration, migration not allowed due to anti-patterns, and failed testing, which require different actions from the user (02:24:40).
- The migration process involves 35-40 checks behind the scenes before a user can migrate their services, including compatibility checks and validation of app properties (02:26:24).
- Once the validation is passed, the system creates the entire Air stack through a process called inspection, which involves getting information about the user's policies, cluster, and current deployment settings (02:27:12).
- The system translates the user's YAML files into AppSpec, which is used to deploy the application to the Air environment (02:28:21).
- The user can then dial traffic between the old and new environments using a UI, and the system validates whether the app is running or not (02:30:00).
- The pipeline for deploying code changes remains the same, with the addition of a new step for deploying to the Air environment (02:33:08).
- The system allows for iterative migration, with the ability to cancel or complete the migration at any time (02:33:41).
- Future enhancements include having a single pipeline, better UX for traffic dialing, and batch validation across fleets (02:35:42).
- The goal is to achieve 80% adoption, focusing on general standards and practices for the majority of the group, while acknowledging that some users may have custom needs that don't fit the platform (02:39:39).
- The abstraction layer is not yet leveraging AI, but it's being considered for the future to make the model more generic (02:41:44).
- The platform is being designed to be more open, with the company being heavily invested in open source and creating a lot of open-source projects (02:38:21).
- The company is working on enhancements for autoscaling, including exploring the option of auto-approving deployments and pivoting to a central DB plus UI approach (02:37:56).
- The platform is designed to support multiple environments, with the exact same environments being vended out for the new platform, and it's up to the developers to decide when to start taking traffic on the new environment (02:40:50).
- The speaker is Ben, one of the maintainers of the Cloud Native Computing Foundation project Kubescape, and will be discussing open-source projects around Kubernetes and threat detection. (05:16:14)
- The focus will be on runtime detection and response, and three CNCF projects will be presented: Falco, Tetragon, and Kubescape. (05:17:33)
- The presentation will cover three attack scenarios and how these projects behave under these scenarios, and will evaluate their value. (05:17:46)
- The speaker notes that the security world has shifted from agentless and posture to runtime detection and response, and that this is a healthy development. (05:16:56)
- The presentation will focus on Kubernetes, container workloads, and application security, and will not cover all layers of security. (05:20:50)
- Three tools, Falco, Tetragon, and Kubescape, are compared in terms of their approaches to runtime detection, with Falco focusing on rules, Tetragon evaluating events inside the kernel, and Kubescape baselining application behavior and alerting on deviations (05:24:03).
- The first attack scenario involves a command injection attack, where an attacker sends a payload to an application, which spawns a process that dumps the service account token; the tools' behaviors are analyzed, with Falco not alerting by default, Tetragon requiring configuration, and Kubescape alerting on unexpected processes and file access (05:26:08).
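Falco does not flag this by default, but a custom rule could cover the token dump; a sketch that assumes the `open_read` macro from Falco's default ruleset (rule name, description, and output format are illustrative):

```yaml
- rule: Read Service Account Token
  desc: Detect any process reading the mounted service account token (illustrative custom rule)
  condition: open_read and fd.name startswith /var/run/secrets/kubernetes.io/serviceaccount
  output: "Service account token read (proc=%proc.name file=%fd.name container=%container.name)"
  priority: WARNING
```

This illustrates the configuration trade-off the talk highlights: rule-based tools alert only on what someone has written a rule for, while baseline-based tools flag any deviation from learned behavior.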
- The second attack scenario involves a server-side request forgery attack, where an attacker sends a payload that causes the application to send a request to Amazon Web Services; the tools' behaviors are analyzed, with Falco and Tetragon requiring configuration to alert, and Kubescape automatically alerting on unexpected egress communication and access to the metadata service (05:31:25).
- The third attack scenario involves a supply chain attack, where an attacker injects malware into a container image; the tools' behaviors are analyzed, with Falco and Tetragon requiring configuration to alert, and Kubescape alerting on new processes and unexpected external communications if the malware is activated after the learning period (05:33:38).
- The tools are compared in terms of their configuration requirements, with Falco being configuration-heavy and Kubescape requiring less configuration (05:34:53).
- Falco, Tetragon, and Kubescape are compared in terms of their ability to detect and alert on suspicious behavior, with Falco being very customizable but hard on CPU, Tetragon being low on CPU and RAM, and Kubescape being low on CPU but using more memory. (05:35:33)
- The tools have different approaches to configuration, with Falco having a policy language, Tetragon having a rule language, and Kubescape using baselines that can be extended but require writing new detection rules in Go. (05:36:47)
- Explainability is also compared, with Falco and Tetragon being clear about what they are alerting on, while Kubescape alerts require more investigation to understand. (05:39:09)
- Kubescape has a predefined learning period of 24 hours, during which it writes the behavior of workloads into application profiles, which are then used for detection. (05:44:11)
- The tools have different performance characteristics, with Falco being high on CPU, Tetragon being low on CPU and RAM, and Kubescape being low on CPU but using more memory. (05:40:45)
- Prevention is also discussed: the tools can alert on malicious behavior but do not necessarily stop it, although eBPF-based enforcement can be used for prevention. (05:47:03)
- Model serving is an integral part of the machine learning life cycle, involving deploying models as microservices and exposing them through API or endpoints for applications to consume (05:55:04).
- Deploying large language models or any models for inferencing requires considerations such as containerization, inference response times, and support for different model runtimes (05:56:51).
- KServe is a one-stop solution for serving any kind of model, supporting different ML runtimes, standardized inference protocols, and serverless constructs (05:59:09).
- KServe can be used to build application platforms on top of it, supporting multiple types of model runtimes and providing additional capabilities like pre- and post-processing (05:59:42).
- Modelcars (packaging models as OCI container images) can help simplify model serving in production, making it more efficient to deploy and scale models (05:54:57).
- KServe is a predictive and generative model inference platform that uses Knative and Istio as a serverless plane on top of Kubernetes infrastructure, with the goal of letting platform engineers make it easy for dev teams and data science teams to consume (06:00:48).
- KServe supports different types of storage, including S3 object storage, file systems, container registries, and Hugging Face repositories, and allows users to specify the runtime to use or have one chosen automatically based on the model format (06:02:49).
- KServe's large language model support includes two serving runtimes, the Hugging Face LLM server and a TorchServe LLM runtime, with the Hugging Face runtime implementing two backends: the default Hugging Face backend and vLLM (06:03:57).
- The demo showcases how to deploy a large language model using the Hugging Face LLM serving runtime on KServe, specifying the model format, tokens, and resource requirements, and how to consume the inference endpoint using Postman (06:07:06).
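An InferenceService using the Hugging Face runtime conventionally looks like the sketch below; the model name, model ID, and resource figures are illustrative, not the demo's exact values:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: huggingface-llama3            # illustrative name
spec:
  predictor:
    model:
      modelFormat:
        name: huggingface             # selects the Hugging Face serving runtime
      args:
        - --model_name=llama3
        - --model_id=meta-llama/Meta-Llama-3-8B-Instruct   # illustrative model
      resources:
        limits:
          cpu: "6"
          memory: 24Gi
          nvidia.com/gpu: "1"         # drop for CPU-only serving of small models
```

Once the service is ready, KServe exposes a standard inference endpoint for it, which is what allows swapping models without changing client code.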
- KServe automatically serves some standard API endpoints for the model being served, allowing users to experiment with different models without changing their code (06:10:13).
- KServe can load a model from an Open Container Initiative (OCI) container image, allowing it to use the constructs built into Kubernetes to pull down a container image, run it, and cache it for faster scaling (06:13:05).
- This approach reduces startup times and disk space usage, enhances performance, and is compatible with KServe infrastructure (06:14:01).
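Loading a model from an OCI image is expressed through the `storageUri` field; a hedged fragment (registry path and model format are hypothetical, and the modelcar feature may need to be enabled in the KServe configuration):

```yaml
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn                                        # hypothetical format
      storageUri: oci://registry.example.com/models/demo:v1  # hypothetical image
```

Because the model is just an image layer, node-level image caching makes scale-out replicas start without re-downloading the model from external storage.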
- GPUs are not mandatory for serving models, smaller models can be served on CPUs, but larger models may require GPUs or a combination of GPUs and CPUs (06:16:32).
- Kubeflow is an MLOps platform for training and fine-tuning, while KServe is for inferencing, and they can be used independently (06:18:08).
- KServe can be used for edge AI, where CPUs are often used, by deploying it on a local device and serving models from there (06:20:59).
- Software can be thought of as a Venn diagram, with software as the outer circle, bugs as a subset of software, and vulnerabilities as the subset of bugs that make an application exploitable by an attacker (06:27:28).
- To reduce vulnerabilities, removing unnecessary software is the first step, which can be achieved by using minimal base images that only include necessary dependencies (06:29:04).
- Known vulnerabilities are a bigger threat than zero-days, and defense against them involves shortening the window by updating quickly once an upstream project fixes a reported vulnerability (06:33:07).
- Choosing a distro that prioritizes speed, such as Wolfi, can help update images quickly, but it's still up to the user to update often and test releases (06:33:20).
- Vulnerabilities have a high cost, even if not exploited, and using CVE-clean images is important to avoid the cost of cataloging and curating vulnerabilities (06:34:16).
- There is no silver bullet for security, and a secure-by-design approach is needed, where software is secure by default and doesn't require deviation from the default to be secure (06:35:51).
- Shifting security left in the development pipeline is not enough to secure a Kubernetes environment, as it puts the burden of security on developers and can lead to false positives and unexploitable vulnerabilities (06:37:11).
- The Swiss cheese security model is advocated for, which involves layering multiple defenses to make it difficult for attackers to breach the system (06:39:02).
- Containers in a Kubernetes environment share a kernel, which can be a large attack surface, and segregation rather than isolation is the best approach (06:41:48).
- A tool called "am-i-isolated" has been released to help understand the difficulty of isolation with containers and educate the community on the problems around container isolation (06:43:24).
- A demo is shown using a tool to apply a standard pod called leaky vessel, which has secret environment variables, and a more pernicious workload called Raider, which can access the shared kernel and expose the secret environment variables (06:44:10).
- A technology provides isolation from the shared kernel, giving each pod its own kernel, and runs containers inside isolated zones for improved security (06:47:30).
- The technology achieves memory isolation through a kernel per pod, separated by a hypervisor, preventing shared memory access between containers (06:53:24).
- The technology differs from KubeVirt, which runs virtual machines on top of Kubernetes, and Kata Containers, which requires virtualization extensions or bare metal, by using paravirtualization to run containers with virtual-machine-like isolation on standard instances (06:52:29).
- The technology allows for observability and logging through tools like Fluentd, even with isolated containers, by shipping kernels for each zone and using tools like Tetragon (06:50:43).
- A CTF (Capture The Flag) challenge is being offered, with a bug bounty for those who can escape the environment, to test the technology's security and gather feedback from the community (06:49:04).
- The Kubernetes plugin machinery was introduced in 2017 in Kubernetes 1.8, allowing developers to add commands to kubectl but not shadow existing built-in subcommands for security reasons (07:13:27).
- To develop a plugin, one can create a binary or executable, put it in their path, and name it "kubectl-<plugin_name>" to invoke it as "kubectl <plugin_name>" (07:13:59).
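The naming convention can be tried end to end with a tiny shell plugin (the plugin name and message here are made up):

```shell
# Create an executable named kubectl-hello somewhere on PATH
mkdir -p "$HOME/bin"
cat > "$HOME/bin/kubectl-hello" <<'EOF'
#!/bin/sh
echo "hello from a kubectl plugin"
EOF
chmod +x "$HOME/bin/kubectl-hello"
export PATH="$HOME/bin:$PATH"

# kubectl would now dispatch `kubectl hello` to this binary;
# invoking it directly shows the same behavior:
kubectl-hello
```

With kubectl installed, `kubectl hello` produces the same output, since kubectl resolves unknown subcommands by looking up `kubectl-<name>` executables on PATH.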
- Krew is a kubectl subproject that manages plugins, and it has a hand-curated list of plugins that are reviewed and maintained by the community (07:15:39).
- The number of plugins on Krew has been steadily increasing, with currently 277 plugins available (07:17:51).
- To develop a CLI tool that feels like kubectl, one can use reusable blocks from open source and put them together, with key aspects including CLI options, configuration and context awareness, and consistency in command organization and resource specification (07:18:19).
- The Kubernetes CLI runtime provides a generic CLI options package that can be easily added to command line flags to configure how the CLI finds and talks to a cluster (07:20:51).
- A Kubernetes client set can be initialized to talk to the API server, and a discovery client can be used to parse the Kubernetes API version from the server (07:22:04).
- The CLI runtime resource builder can be used to query resources, and it supports various command line arguments such as "get pods" or "get services" (07:23:39).
- The resource builder can also be used to query custom resources, but it requires using the unstructured client, which treats API objects as unformatted JSON objects (07:26:56).
- The unstructured client can be used to build custom CLIs that work with any type of resource, but it requires writing more code to deal with the unstructured objects (07:27:37).
- kubens or the kubectl config command can be used to switch to another namespace and set a default namespace for a cluster (07:28:59).
- The existing code doesn't handle default namespaces, but it's not hard to implement (07:29:14).
- A kubectl-style "get pods" command in the demo CLI can be used to print objects, and flags can be added to customize the printing functionality (07:30:58).
- Server-side printing can be used to get additional information, such as pod readiness, by adding a header to the request builder (07:32:55).
- Recommended resources include a blog post with good examples, and plugins like "pods on con", Krew, kubectl tree, and kubectl neat (07:33:47).
- General advice for choosing between a kubectl plugin and a standalone tool is to consider whether it manages generic Kubernetes resources or custom CRDs (07:36:59).
- There is one more session in the theater. (07:38:08)
- Attendees are encouraged to attend all the lightning talks after the current session. (07:38:12)