
-- calculate entropy of an object (Shannon entropy, in bits per byte)
function object_entropy(full_name)
local byte_hist, total = {}, 0
for i = 1, 256 do byte_hist[i] = 0 end
-- count byte frequencies in the object's data
local data = assert(io.open(full_name, "rb")):read("*a")
for i = 1, #data do
local b = data:byte(i) + 1
byte_hist[b], total = byte_hist[b] + 1, total + 1
end
local entropy = 0
for _, count in ipairs(byte_hist) do
if count > 0 then entropy = entropy - (count/total) * math.log(count/total) / math.log(2) end
end
return entropy
end
# assuming we are running a vstart cluster
import subprocess
import time
import sys

def system(cmd):
    # run a shell command and return its stripped stdout
    output = subprocess.check_output(cmd, shell=True).decode(sys.stdout.encoding).strip()
    return output

prev_rgw_sum = 0
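The system() helper above can be exercised on its own; a self-contained copy (assuming a POSIX shell is available):

```python
import subprocess
import sys

def system(cmd):
    # run a shell command and return its stripped stdout
    # (same helper as in the script above)
    return subprocess.check_output(cmd, shell=True).decode(sys.stdout.encoding).strip()
```

For example, system("echo hello") returns the string "hello".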

Goal

The goal of this setup is to overload a single RGW, so that adding another one would increase the throughput without overloading the OSDs.

Setup

  • machine with multiple NVMe drives and enough CPU/RAM to run both Ceph and the clients, e.g.
$ lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda           8:0    0 893.8G  0 disk 
└─sda1        8:1    0 893.8G  0 part /

start cluster:

MON=1 OSD=1 MDS=0 MGR=0 RGW=1 ../src/vstart.sh -n -d 

start HTTP endpoint:

wget https://gist.githubusercontent.com/mdonkers/63e115cc0c79b4f6b8b3a6b797e485c7/raw/a6a1d090ac8549dac8f2bd607bd64925de997d40/server.py
python server.py 10900

Delivery Guarantee

Due to the batching, there are situations where messages reside inside the converter. The MQTT protocol does not allow for end-to-end acknowledgments, meaning that once messages arrive at the converter, they are considered "delivered". Therefore, if the converter fails, any messages that were not yet uploaded into the S3 object are lost. To guarantee delivery, we would need a mechanism that makes sure messages are not lost if the converter crashes while waiting for a batch to fill.

One option for that would be to write every message to persistent media (e.g. disk) as it arrives. If the process restarts, it would read that file and send the data in it. However, this has two main drawbacks:

  • there would be a significant performance cost
  • since messages are automatically acked when received, anything still in the disk write buffer, or not yet written to it, would still be lost on a crash

Prerequisites

TODO

Create the VM

Use the "system" session by defaul:

export LIBVIRT_DEFAULT_URI="qemu:///system"

Create parameters for the setup:

Goal

The goal here is to run multisite tests on a single node at scale (assuming the node has enough capacity). This is done using vstart/mstart so the development-test cycle is faster.

Prerequisites

cmake -DBOOST_J=$(nproc) -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DWITH_MGR_DASHBOARD_FRONTEND=OFF -DWITH_SEASTAR=OFF \
-DWITH_DPDK=OFF -DWITH_SPDK=OFF -DWITH_CEPHFS=OFF -DWITH_RBD=OFF -DWITH_KRBD=OFF -DWITH_CCACHE=OFF \
-DWITH_MANPAGE=OFF -DWITH_LTTNG=OFF -DWITH_BABELTRACE=OFF -DWITH_SYSTEM_BOOST=OFF -DWITH_BOOST_VALGRIND=ON \

Setup

start:

MON=1 OSD=1 MDS=0 MGR=0 ../src/test/rgw/test-rgw-multisite.sh 2 --rgw_max_objs_per_shard=50 \
    --rgw_reshard_thread_interval=60 --rgw_user_quota_bucket_sync_interval=90 \
    --rgw_data_notify_interval_msec=0 --rgw_sync_log_trim_interval=0
  • rgw_max_objs_per_shard: make sure reshard is happening often
  • rgw_reshard_thread_interval: make sure reshard happens more than once within the time of the test (200s)
  • rgw_data_notify_interval_msec: disable out-of-band sync triggers (so they won't hide any issues)
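To illustrate what rgw_max_objs_per_shard=50 implies for the test, here is a simplified lower-bound estimate (min_expected_shards is an invented helper; the actual dynamic-resharding logic in RGW is more involved, so treat this only as a sanity check):

```python
import math

def min_expected_shards(num_objects, max_objs_per_shard=50):
    # a bucket holding num_objects objects should eventually be resharded
    # to at least ceil(num_objects / max_objs_per_shard) shards
    return max(1, math.ceil(num_objects / max_objs_per_shard))
```

For example, after writing 120 objects into a bucket, at least 3 shards would be expected once resharding has run.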

Goal

This document adds more details on the GSoC22 project "Telescópio Lua".

Ceph is a distributed storage system that supports block, file, and object storage. All types of storage use the RADOS backend storage system. S3-compliant object storage is provided by the Object Gateway (a.k.a. the RADOS Gateway or the RGW).

In this project, the payload of the objects being uploaded (PUT) or retrieved (GET) should be exposed as a stream of bytes to Lua in the RGW.

  • The Lua script should be able to read the payload, perform calculations on it, and use the outcome: decisions could be made based on it, or it could be written to object attributes, logged, or sent to external systems.
  • The Lua script should be able to rewrite the payload being uploaded (PUT) or retrieved (GET).

Goal

Ceph is a distributed storage system that supports block, file, and object storage. All types of storage use the RADOS backend storage system. S3-compliant object storage is provided by the Object Gateway (a.k.a. the RADOS Gateway or the RGW). Since we are S3 compliant, clients can connect to the RGW using the standard client libraries provided by AWS. However, our bucket notification offering extends the functionality offered by AWS. We have several examples of how to hack the standard AWS clients to use our extended bucket notification APIs. Currently, we have such examples for Python (using the boto3 library); however, we need to keep them up to date with the recent changes in our code.