Yuval Lifshitz yuvalif

Setup

start a vstart cluster with RGW

Alternative 1

Add object locking to all bucket creations via a Lua script.

upload the following script in prerequest context:

-- enable object lock on bucket creation

copied from: https://claude.ai/share/e4bed98a-9049-44b3-9aee-173bba941120

When a Kafka producer sets partitions explicitly, there are several important trade-offs to consider:

Pros of Explicit Partition Assignment

Guaranteed Message Ordering: Messages sent to the same partition are guaranteed to maintain their order. This is crucial for use cases where sequence matters, like financial transactions or event sourcing.
Predictable Data Locality: You can ensure related messages always go to the same partition, which helps with consumer processing efficiency and stateful operations.
Load Distribution Control: You have fine-grained control over how messages are distributed across partitions, allowing you to optimize for your specific access patterns.
Deterministic Behavior: Your application's behavior becomes more predictable since you know exactly where each message will land.
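The deterministic placement described above can be sketched in a few lines of shell. This is only an illustration under stated assumptions: `partition_for` is a hypothetical helper, and `cksum` (the POSIX CRC) is a stand-in hash, not the murmur2 hash that the Kafka Java client's default partitioner applies to keys. The point is that a producer which computes the partition itself always sends the same key to the same partition.

```shell
# Hypothetical helper: map a message key to a partition number.
# cksum is a stand-in hash chosen for portability; it is NOT Kafka's own hash.
partition_for() {
    key=$1
    num_partitions=$2
    # cksum prints "<crc> <byte count>" when reading stdin; keep the CRC only
    crc=$(printf '%s' "$key" | cksum | cut -d ' ' -f 1)
    echo $((crc % num_partitions))
}
```

Calling `partition_for order-42 6` repeatedly yields the same number in the range 0 to 5, which the producer can then pass as the explicit partition for every message carrying that key.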

machine

 lsblk
NAME           MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
sda              8:0    0 893.8G  0 disk
└─sda1           8:1    0 893.7G  0 part /
nvme1n1        259:0    0   1.5T  0 disk
nvme0n1        259:1    0   1.5T  0 disk
├─vg_nvme-lv_1 253:0    0  89.4G  0 lvm
├─vg_nvme-lv_2 253:1    0  89.4G  0 lvm

Basic Bucket Logging Testing

to enable our extension to the API when using python (boto3 or the aws CLI), the following file has to be placed under: ~/.aws/models/s3/2006-03-01/ (create the directory if it does not exist)

currently there is no generic solution for other client SDKs
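The directory mentioned above can be prepared ahead of time. The following is a minimal sketch: `install_s3_model_dir` is a hypothetical helper name, and its optional argument (a base directory, defaulting to `$HOME`) exists only to make the helper easy to try out in a scratch location.

```shell
# Hypothetical helper: create the model override directory used by boto3 and the aws CLI.
# The optional first argument replaces $HOME as the base directory.
install_s3_model_dir() {
    base_dir=${1:-$HOME}
    model_dir="${base_dir}/.aws/models/s3/2006-03-01"
    # -p creates missing parent directories and succeeds if the directory already exists
    mkdir -p "${model_dir}"
    echo "${model_dir}"
}
```

Running `install_s3_model_dir` with no arguments prints the path under the home directory; the model file can then be copied into that path.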

start a vstart cluster
create a bucket:

aws --endpoint-url http://localhost:8000 s3 mb s3://fish

create a log bucket:

how much time did this refactoring work take?

Looking at the timestamps in our conversation, the refactoring work took approximately 45-50 minutes from start to finish.

Timeline Breakdown:

Phase 1: Analysis & Planning (5-10 minutes)

Understanding the codebase structure
Identifying the RGWOp class hierarchy
Finding all canonical_name() overrides

this is a followup on: https://gist.github.com/yuvalif/d50bcc3d27121732ffbbafe7b7146112 for an RGW of version "squid" or above, where bucket notifications and topics are synced between zones

deploy a zonegroup with 2 zones:

MON=1 OSD=1 MDS=0 MGR=0 ../src/test/rgw/test-rgw-multisite.sh 2

export credentials:

export AWS_ACCESS_KEY_ID=1234567890

start a vstart cluster
create a tenanted user:

bin/radosgw-admin user create --display-name "Ka Boom" --tenant boom --uid ka --access_key ka --secret_key boom

create a bucket on that tenant

AWS_ACCESS_KEY_ID=ka AWS_SECRET_ACCESS_KEY=boom aws --endpoint-url http://localhost:8000 s3 mb s3://fish

create a log bucket with no tenant

Warm and Fuzzy

Background

The RGW's frontend is an S3 REST API server, and in this project we would like to use a REST API fuzzer to test the RGW for security issues (and other bugs). We recommend exploring the RESTler tool; there is a very good intro in this video. The idea is to feed it the AWS S3 OpenAPI spec and see what happens when we let it connect to the RGW.

Project

Initial (evaluation) Phase

run Ceph with a radosgw. you can use cephadm to install and run Ceph in containers, or build it from source and run it in a vstart cluster

	-- Lua script to auto-tier S3 object PUT requests
	-- based on this: https://ceph.io/en/news/blog/2024/auto-tiering-ceph-object-storage-part-2/

	-- exit script quickly if it is not a PUT request
	if Request == nil or Request.RGWOp ~= "put_obj" then
		return
	end

	local threshold = 1024*1024 -- 1MB
	local debug = true