@yuvalif
Created March 9, 2022 09:42

Goal

This document adds more details on the GSoC22 project "Telescópio Lua".

Ceph is a distributed storage system that supports block, file, and object storage. All types of storage use the RADOS backend storage system. S3-compliant object storage is provided by the Object Gateway (a.k.a. the RADOS Gateway or the RGW).

In this project we should expose the payload of the objects being uploaded (PUT) or retrieved (GET) as a stream of bytes to Lua in the RGW:

  • The Lua script should be able to read the payload, perform calculations on it, and use the outcome. The outcome could drive decisions, be written to object attributes, be logged, or be sent to external systems.
  • The Lua script should be able to rewrite the payload being uploaded (PUT) or retrieved (GET).

Note that for large objects, only part of the payload is exposed to Lua each time a request is handled. Storing data and handling large objects are not in scope here.

Evaluation Period

Try out Ceph

Install Linux

The first step is to have a Linux-based development environment. As a minimum, you would need a machine with 8 CPUs, 16GB of RAM, and 50GB of disk.

Note that using a machine with a lower spec is also possible, but the Ceph build might take several hours.
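To see how a machine compares with these numbers, something like the following could be used. This is only a sketch: the thresholds are the ones suggested in this document, the `check` helper is a hypothetical name, and the script assumes a Linux system with `nproc`, `awk`, `df` and `/proc/meminfo` available.

```shell
#!/bin/sh
# Minimum spec suggested in this document.
MIN_CPUS=8
MIN_RAM_GB=16
MIN_DISK_GB=50

# Print "ok" if the measured value ($1) meets the minimum ($2), else "low".
check() {
    if [ "$1" -ge "$2" ] 2>/dev/null; then echo "ok"; else echo "low"; fi
}

cpus=$(nproc)
# MemTotal is reported in kB and is slightly below the installed RAM,
# so round to the nearest GB instead of truncating.
ram_gb=$(awk '/^MemTotal:/ { printf "%d", ($2 + 524288) / 1048576 }' /proc/meminfo)
# Free space (in GB) on the filesystem that holds the current directory.
disk_gb=$(df -Pk . | awk 'NR == 2 { printf "%d", $4 / 1048576 }')

echo "CPUs:    $cpus (min $MIN_CPUS): $(check "$cpus" "$MIN_CPUS")"
echo "RAM GB:  $ram_gb (min $MIN_RAM_GB): $(check "$ram_gb" "$MIN_RAM_GB")"
echo "Disk GB: $disk_gb (min $MIN_DISK_GB): $(check "$disk_gb" "$MIN_DISK_GB")"
```

Run it from the directory where you plan to clone Ceph, so that the disk check looks at the right filesystem.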

Unless you already have a Linux distro you like, I would recommend choosing from:

  • Fedora - my favorite (34 or higher)
  • Ubuntu (20.04 and up)
  • openSUSE (Leap 15.2 or Tumbleweed)

Using WSL on your Windows machine is also possible, but build times would be longer than on native Linux.

Git

Once you have that up and running, you should clone the Ceph repo from GitHub (https://github.com/ceph/ceph). If you don’t know what GitHub and Git are, this is the right time to close these gaps :-) And yes, you should have a GitHub account, so you can later share your work on the project.
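A minimal sketch of the clone step, assuming `git` is installed. The `clone_ceph` function name and the target directory are hypothetical; the submodule step is there because Ceph keeps part of its code in git submodules.

```shell
#!/bin/sh
# Repository URL as given in this document.
CEPH_REPO_URL="https://github.com/ceph/ceph"

# Clone Ceph into the directory given as $1, then fetch its git submodules.
clone_ceph() {
    git clone "$CEPH_REPO_URL" "$1" &&
        git -C "$1" submodule update --init --recursive
}

# Usage (a large download, so it is left commented out here):
# clone_ceph "$HOME/ceph"
```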

Build

The repo has a README file with instructions on how to build Ceph. Just follow these instructions and build it (depending on the number of CPUs you have, this may take a while). Our build system is based on CMake, so it is probably a good idea to know a little bit about it. Assuming the build completed successfully, you can run the unit tests (see: https://github.com/ceph/ceph#running-unit-tests).
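As a rough sketch, the build steps could look like the following. The script names (`install-deps.sh`, `do_cmake.sh`) and the use of `ninja` and `ctest` are taken from the Ceph README as I remember it, so treat them as assumptions and check the README of your checkout, which is the authoritative source. The `build_ceph` function name is hypothetical.

```shell
#!/bin/sh
# Build Ceph from the source tree given as $1 (assumed to be a fresh clone).
build_ceph() {
    cd "$1" || return 1
    ./install-deps.sh || return 1   # install the build dependencies
    ./do_cmake.sh || return 1       # generate the build tree under ./build
    cd build || return 1
    ninja || return 1               # build everything, this is the slow part
    ctest -j"$(nproc)"              # optionally run the unit tests in parallel
}

# Usage (takes a long time, so it is left commented out here):
# build_ceph "$HOME/ceph"
```

The function stops at the first step that fails, so a broken dependency install does not lead into a long build that cannot succeed.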

Try the RGW

Now you are ready to run the Ceph processes (see: https://github.com/ceph/ceph#running-a-test-cluster). You would probably also like to check the developer guide (https://docs.ceph.com/docs/master/dev/developer_guide/) and learn more about how to build Ceph and run it locally (https://docs.ceph.com/docs/master/dev/quick_guide/).
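For this project only a small cluster with one RGW is needed. The sketch below assumes that `vstart.sh` reads the number of daemons from the `MON`, `OSD`, `MDS`, `MGR` and `RGW` environment variables, and that `-n` starts a new cluster and `-d` enables debug output. Please verify this against the links above. The function names are hypothetical.

```shell
#!/bin/sh
# Start a small test cluster with a single RGW.
# $1 is the Ceph build directory (e.g. $HOME/ceph/build).
start_test_cluster() {
    cd "$1" || return 1
    MON=1 OSD=1 MDS=0 MGR=0 RGW=1 ../src/vstart.sh -n -d
}

# Stop all the daemons that vstart.sh started.
stop_test_cluster() {
    cd "$1" || return 1
    ../src/stop.sh
}

# Usage:
# start_test_cluster "$HOME/ceph/build"
# stop_test_cluster "$HOME/ceph/build"
```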

Assuming you have everything up and running, you can create a bucket in Ceph and upload an object to it. The best way to do that is with s3cmd, a Python command-line tool (https://github.com/s3tools/s3cmd). Note that the tool is mainly geared towards AWS S3, so make sure to specify the location of the RGW as the endpoint, and the RGW credentials (as printed to the screen after running vstart.sh).

For example:

$ s3cmd --host=localhost:8000 --host-bucket="localhost:8000/%(bucket)" \
--access_key=0555b35654ad1656d804 \
--secret_key=h7GhxuBLTrlhVUyxSPUKUV8r/2EI4ngqJxD7iBdBYLhwluN30JaT3Q== \
mb s3://mybucket

This would create a bucket called mybucket in Ceph. And:

$ s3cmd --host=localhost:8000 --host-bucket="localhost:8000/%(bucket)" \
--access_key=0555b35654ad1656d804 \
--secret_key=h7GhxuBLTrlhVUyxSPUKUV8r/2EI4ngqJxD7iBdBYLhwluN30JaT3Q== \
put myimage.jpg s3://mybucket

This would put myimage.jpg into that bucket.
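Since the same options repeat in every command, they could be wrapped in a small helper. This is a sketch only: `rgw_s3cmd` and the `RGW_*` variable names are hypothetical, and the endpoint and credentials are copied from the examples above, so replace them with the values from your own cluster.

```shell
#!/bin/sh
# Endpoint and credentials copied from the examples above; replace them with
# the values printed by vstart.sh on your machine.
RGW_HOST="localhost:8000"
RGW_ACCESS_KEY="0555b35654ad1656d804"
RGW_SECRET_KEY="h7GhxuBLTrlhVUyxSPUKUV8r/2EI4ngqJxD7iBdBYLhwluN30JaT3Q=="

# Run s3cmd against the local RGW with the common options filled in.
# S3CMD can be set to point at a specific s3cmd executable.
rgw_s3cmd() {
    "${S3CMD:-s3cmd}" --host="$RGW_HOST" \
        --host-bucket="$RGW_HOST/%(bucket)" \
        --access_key="$RGW_ACCESS_KEY" \
        --secret_key="$RGW_SECRET_KEY" \
        "$@"
}

# Usage:
# rgw_s3cmd ls s3://mybucket                            # list the bucket
# rgw_s3cmd get s3://mybucket/myimage.jpg copy.jpg      # download the object
```

Listing the bucket and downloading the object again is an easy way to verify that the upload worked. If s3cmd on your machine is configured to use HTTPS, adding `--no-ssl` may be needed.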

Try Lua Scripting on the RGW

Lua is not a commonly used language, but it is very intuitive and easy to learn. To get started, the best option is to use the free online version of "Programming in Lua" (PIL).

Note that this is a guide for Lua 5.0, while we are using Lua 5.3 - but for the basic stuff it should not matter much.

A reference manual exists for the latest Lua version.

More information on learning Lua can be found in the Lua Users Wiki.

Please pick one of the code examples from here to test Lua scripting on the RGW.
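Before trying one of the examples, a trivial script can confirm that the mechanism works. The sketch below assumes the `radosgw-admin script put` command and the `RGWDebugLog` function and `Request.RGWOp` field described in the Ceph Lua scripting documentation. The context name may be spelled differently in your Ceph version, and the file name and `upload_lua_script` function are hypothetical.

```shell
#!/bin/sh
# Write a tiny Lua script to a temporary file.
LUA_SCRIPT="${TMPDIR:-/tmp}/hello_rgw.lua"
cat > "$LUA_SCRIPT" <<'EOF'
-- log the name of every operation handled by the RGW
RGWDebugLog("Lua saw op: " .. Request.RGWOp)
EOF

# Upload the script so that it runs before each request is processed.
# $1 is the Ceph build directory (e.g. $HOME/ceph/build).
# Assumption: "preRequest" is the context name accepted by your version.
upload_lua_script() {
    "$1/bin/radosgw-admin" script put --infile="$LUA_SCRIPT" --context=preRequest
}

# Usage:
# upload_lua_script "$HOME/ceph/build"
```

After uploading, any s3cmd command should produce a matching line in the RGW log, as long as the RGW debug level is high enough for debug messages to be printed.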

Possible Lua Related Contributions

I would recommend trying to contribute to the Ceph project based on this PR:

  • Rewrite the Prometheus example to send the counters from the background context instead of on every request.
  • Use Intel's TBB concurrent hash map to replace the current mutex-based implementation of the background hash map.

Other Lua related contributions could be:
