@yuvalif
Last active April 28, 2022 19:28

Goal

Ceph is a distributed storage system that supports block, file, and object storage. All types of storage use the RADOS backend storage system. S3-compliant object storage is provided by the Object Gateway (a.k.a. the RADOS Gateway or the RGW). Since the RGW is S3 compliant, clients can connect to it using the standard client libraries provided by AWS. However, our bucket notification offering extends the functionality offered by AWS. We have several examples of how to hack the standard AWS clients to use our extended bucket notification APIs. Currently, we have such examples for Python (using the boto3 library). However, we need to keep them up to date with the recent changes in our code, and we are missing examples of how to hack the Golang and Java AWS SDKs for the same purpose.

In this project we should:

  1. Python: make sure that all of the python (boto3 based) examples are up-to-date with the code
  2. Golang: extend the golang AWS SDK to include our bucket notification extensions. This was partially done as part of the Rook project but should be completed and moved to our go ceph library
  3. Java: the same should be done for the AWS Java SDK. This should be added as a new subdirectory in the ceph examples directory
  4. Documentation: Ceph's S3 client documentation should be updated

Evaluation Period

Try out Ceph

Install Linux

First, you need a Linux-based development environment. As a minimum, you would need a machine with 8 CPUs, 16GB of RAM, and a 50GB disk.

Note that using a machine with a lower spec is also possible, but the Ceph build might then take several hours.

Unless you already have a Linux distro you like, I would recommend choosing from:

  • Fedora - my favorite (34 or higher)
  • Ubuntu (20.04 and up)
  • openSUSE (Leap 15.2 or Tumbleweed)

Using WSL on your Windows machine is also possible, but build times would be longer than running native Linux

Git

Once you have that up and running, you should clone the Ceph repo from GitHub (https://github.com/ceph/ceph). If you don’t know what GitHub and git are, this is the right time to close these gaps :-) And yes, you should have a GitHub account, so you can later share your work on the project.

Build

The repo has a readme file with instructions on how to build Ceph. Just follow these instructions and build it (depending on the number of CPUs you have, this may take a while). Our build system is based on CMake, so it is probably a good idea to know a little bit about that. Assuming the build was completed successfully, you can run the unit tests (see: https://github.com/ceph/ceph#running-unit-tests).

Try the RGW

Now you are ready to run the Ceph processes, as explained in the readme (https://github.com/ceph/ceph#running-a-test-cluster). You would probably also like to check the developer guide (https://docs.ceph.com/docs/master/dev/developer_guide/) and learn more about how to build Ceph and run it locally (https://docs.ceph.com/docs/master/dev/quick_guide/). I would recommend running vstart as follows (from inside the build directory):

MON=1 OSD=1 MDS=0 MGR=0 RGW=1 ../src/vstart.sh -n -d
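Before moving on, it can be handy to confirm that the RGW is actually accepting connections. The following is only a minimal sketch: the helper name wait_for_port is made up for illustration, and port 8000 is an assumption taken from the s3cmd examples further down, not something this command guarantees.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Nothing is listening yet (or the connection failed): retry shortly.
            time.sleep(0.5)
    return False
```

For example, wait_for_port("localhost", 8000) would block for up to 30 seconds until something listens on that port. It only checks that the port is open, not that the listener is the RGW.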

Assuming you have everything up and running, you can create a bucket in Ceph and upload an object to it. The best way to do that is with the s3cmd Python command-line tool (https://github.com/s3tools/s3cmd). Note that the tool is mainly geared towards AWS S3, so make sure to specify the location of the RGW as the endpoint, and to use the RGW credentials (as printed to the screen after running vstart.sh).

For example:

$ s3cmd --no-ssl --host=localhost:8000 --host-bucket="localhost:8000/%(bucket)" \
--access_key=0555b35654ad1656d804 \
--secret_key=h7GhxuBLTrlhVUyxSPUKUV8r/2EI4ngqJxD7iBdBYLhwluN30JaT3Q== \
mb s3://mybucket

This would create a bucket called mybucket in Ceph. And:

$ s3cmd --no-ssl --host=localhost:8000 --host-bucket="localhost:8000/%(bucket)" \
--access_key=0555b35654ad1656d804 \
--secret_key=h7GhxuBLTrlhVUyxSPUKUV8r/2EI4ngqJxD7iBdBYLhwluN30JaT3Q== \
put myimage.jpg s3://mybucket

This would put myimage.jpg into that bucket.
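The same two steps can also be done from Python with the boto3 library. This is a rough sketch under a few assumptions: boto3 is installed, the RGW listens on http://localhost:8000 as in the s3cmd commands above, and the function names are made up for illustration.

```python
import os

# Assumption: same RGW endpoint as in the s3cmd commands above.
ENDPOINT = "http://localhost:8000"


def s3_client_kwargs(access_key: str, secret_key: str, endpoint: str = ENDPOINT) -> dict:
    """Build the keyword arguments for boto3.client() so that it talks to the RGW."""
    return {
        "service_name": "s3",
        "endpoint_url": endpoint,
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }


def make_bucket_and_upload(access_key: str, secret_key: str, bucket: str, filename: str) -> None:
    """Create the bucket and upload the file, using the file's base name as the object key."""
    import boto3  # imported here so the helper above can be used without boto3

    client = boto3.client(**s3_client_kwargs(access_key, secret_key))
    client.create_bucket(Bucket=bucket)
    client.upload_file(filename, bucket, os.path.basename(filename))
```

Calling make_bucket_and_upload(access_key, secret_key, "mybucket", "myimage.jpg") with the credentials printed by vstart.sh should be roughly equivalent to the two s3cmd commands.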

Try Bucket Notifications

Currently, bucket notifications support sending messages to HTTP, AMQP and Kafka endpoints. To test bucket notifications over HTTP:

  • make sure the local vstart cluster is already running
  • set up the AWS CLI with RGW extensions according to this (note that this was tested only with AWS CLI v1). The s3cmd tool used above does not have the notification-related APIs, so we should use the AWS CLI tool here
  • start an HTTP server (which accepts POST messages) on port 10900 (or any other port, but then change the topic configuration below accordingly). You can use the following Python server (note that you would need Python 3 for this server):
wget https://gist.githubusercontent.com/mdonkers/63e115cc0c79b4f6b8b3a6b797e485c7/raw/a6a1d090ac8549dac8f2bd607bd64925de997d40/server.py
python server.py 10900
  • create a bucket notification topic with the above HTTP server as its endpoint:
aws --endpoint-url http://localhost:8000 sns create-topic --name=fishtopic \
  --attributes='{"push-endpoint": "http://localhost:10900"}'
  • create a bucket:
aws --endpoint-url http://localhost:8000 s3 mb s3://fish
  • create a bucket notification tying together the above topic and bucket:
aws --region=default --endpoint-url http://localhost:8000 s3api put-bucket-notification-configuration \
  --bucket fish \
  --notification-configuration='{"TopicConfigurations": [{"Id": "notif1", "TopicArn": "arn:aws:sns:default::fishtopic", "Events": []}]}'
  • create a file, upload it to the bucket, and make sure that the HTTP server got the notification:
head -c 1024 </dev/urandom > myfile
aws --endpoint-url http://localhost:8000 s3 cp myfile s3://fish
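If you prefer not to download the server script, a POST receiver can be sketched with the Python 3 standard library alone. This is not the linked server.py, only an illustrative stand-in: the names NotificationHandler and start_server are made up, and it assumes the sender sets a Content-Length header.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class NotificationHandler(BaseHTTPRequestHandler):
    """Print the body of every POST request and answer with 200 OK."""

    received = []  # bodies seen so far, kept for later inspection

    def do_POST(self):
        # Assumption: the sender provides Content-Length (no chunked bodies).
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode("utf-8", errors="replace")
        NotificationHandler.received.append(body)
        print(body)
        self.send_response(200)
        self.end_headers()

    def log_message(self, format, *args):
        pass  # silence the default per-request access log


def start_server(port: int = 10900) -> HTTPServer:
    """Start the receiver in a background thread and return the server object."""
    server = HTTPServer(("localhost", port), NotificationHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling start_server() listens on port 10900, matching the push-endpoint of the topic above. Since the server runs in a daemon thread, the calling script has to stay alive (for example, by waiting on input()) for as long as notifications are expected.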

Try the AWS Golang Client

Write a small golang app that uploads an object using the AWS golang SDK. Code should be submitted as a PR that:

  • add a directory called "golang" under ceph examples
  • add your code under that directory
  • add a README.md file that holds full instructions on how to build and run your example against a vstart cluster

Try the AWS Java Client

Write a small Java app that uploads an object using the AWS Java SDK. Code should be submitted as a PR that:

  • add a directory called "java" under ceph examples
  • add your code under that directory
  • add a README.md file that holds full instructions on how to build and run your example against a vstart cluster
@Ochuwa-sophie

Ok, thanks.

@yuvalif I am quite confused at this statement Golang: extend the [golang AWS SDK](https://github.com/aws/aws-sdk-go) to include our bucket notification extensions. This was partially done as part of the [Rook project](https://github.com/rook/rook/blob/master/pkg/operator/ceph/object/notification/s3ext.go) but should be completed and moved to our [go ceph](https://github.com/ceph/go-ceph) library.
Are we not supposed to make ceph-go support putBucketNotification instead, because aws-golang-sdk already supports putBucketNotification?

Heyy, I am also applying to ceph but I will like to make a guess. I think what Yuvalif means is that we should actually make the Golang aws sdk support putBucketNotification.

please see: https://gist.github.com/yuvalif/e1766b75594a45dcdea8717bcc6f4525?permalink_comment_id=4142747#gistcomment-4142747

@miracle1504

@yuvalif I am quite confused at this statement Golang: extend the [golang AWS SDK](https://github.com/aws/aws-sdk-go) to include our bucket notification extensions. This was partially done as part of the [Rook project](https://github.com/rook/rook/blob/master/pkg/operator/ceph/object/notification/s3ext.go) but should be completed and moved to our [go ceph](https://github.com/ceph/go-ceph) library.
Are we not supposed to make ceph-go support putBucketNotification instead, because aws-golang-sdk already supports putBucketNotification?

They already implemented it, but are missing some extensions that were added to the API as part of Ceph. Note that this is the actual project, and not part of the "evaluation period".

Ok, thank you!
