Ceph is a distributed storage system that supports block, file, and object storage. All types of storage use the RADOS backend storage system. S3-compliant object storage is provided by the Object Gateway (a.k.a. the RADOS Gateway or the RGW). Since the RGW is S3 compliant, clients can connect to it using the standard client libraries provided by AWS. However, our bucket notification offering extends the functionality offered by AWS. We have several examples of how to hack the standard AWS clients to use our extended bucket notification APIs. Currently, we have such examples only for Python (using the boto3 library). These need to be kept up to date with recent changes in our code, and we are missing examples of how to hack the Golang and Java AWS SDKs for the same purpose.
In this project we should:
- Python: make sure that all of the Python (boto3-based) examples are up to date with the code
- Golang: extend the Golang AWS SDK to include our bucket notification extensions. This was partially done as part of the Rook project, but should be completed and moved to our go-ceph library
- Java: similarly, the AWS Java SDK should be extended. This should be added as a new subdirectory in the Ceph examples directory
- Documentation: Ceph's S3 client documentation should be updated
First, you need a Linux-based development environment. As a minimum, you would need a machine with 8 CPUs, 16GB RAM, and a 50GB disk.
Note that using a machine with a lower spec is also possible, but the Ceph build might take several hours.
Unless you already have a Linux distro you like, I would recommend choosing from:
- Fedora - my favorite (34 or higher)
- Ubuntu (20.04 and up)
- openSUSE (Leap 15.2 or Tumbleweed)
Using WSL on your Windows machine is also possible, but build times would be longer than on native Linux.
Once you have that up and running, you should clone the Ceph repo from GitHub (https://github.com/ceph/ceph). If you don’t know what GitHub and git are, this is the right time to close these gaps :-) And yes, you should have a GitHub account, so you can later share your work on the project.
The repo has a readme file with instructions on how to build Ceph. Just follow these instructions and build it (depending on the number of CPUs you have, this may take a while).
Our build system is based on cmake, so it is probably a good idea to know a little bit about that.
Assuming the build was completed successfully, you can run the unit tests (see: https://github.com/ceph/ceph#running-unit-tests).
Now you are ready to run the ceph processes, as explained here: https://github.com/ceph/ceph#running-a-test-cluster
You probably would also like to check the developer guide (https://docs.ceph.com/docs/master/dev/developer_guide/) and learn more on how to build Ceph and run it locally (https://docs.ceph.com/docs/master/dev/quick_guide/).
I would recommend running vstart as follows (from inside the build directory):
MON=1 OSD=1 MDS=0 MGR=0 RGW=1 ../src/vstart.sh -n -d
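Before moving on, it can help to confirm that the RGW started by vstart.sh is accepting connections. The following is a minimal sketch, assuming the RGW listens on localhost:8000 (the endpoint used by the s3cmd and AWS CLI commands in this document). The function name `rgw_is_up` is hypothetical, and the check only verifies that a TCP connection can be opened, not that the gateway is fully functional.

```python
import socket


def rgw_is_up(host="localhost", port=8000, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened.

    The default host and port are assumptions taken from the endpoint
    used elsewhere in this document; adjust them for your cluster.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, or name resolution failure
        return False


if __name__ == "__main__":
    print("RGW reachable" if rgw_is_up() else "RGW not reachable")
```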
Assuming you have everything up and running, you can create a bucket in Ceph and upload an object to it.
The best way to do that is the s3cmd Python command-line tool: https://github.com/s3tools/s3cmd
Note that the tool is mainly geared towards AWS S3, so make sure to specify the location of the RGW as the endpoint, and the RGW credentials (as printed to the screen after running vstart.sh).
For example:
$ s3cmd --no-ssl --host=localhost:8000 --host-bucket="localhost:8000/%(bucket)" \
--access_key=0555b35654ad1656d804 \
--secret_key=h7GhxuBLTrlhVUyxSPUKUV8r/2EI4ngqJxD7iBdBYLhwluN30JaT3Q== \
mb s3://mybucket
This creates a bucket called mybucket in Ceph.
And:
$ s3cmd --no-ssl --host=localhost:8000 --host-bucket="localhost:8000/%(bucket)" \
--access_key=0555b35654ad1656d804 \
--secret_key=h7GhxuBLTrlhVUyxSPUKUV8r/2EI4ngqJxD7iBdBYLhwluN30JaT3Q== \
put myimage.jpg s3://mybucket
This puts myimage.jpg into that bucket.
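Since the two s3cmd commands above share the same endpoint and credential options, those options can be assembled once. The sketch below only builds and prints the command lines; it does not run s3cmd. The helper name `s3cmd_args` is hypothetical, and the credentials are the example values from the commands above, so replace them with the ones printed by your own vstart.sh run.

```python
import shlex

# Example values copied from the s3cmd commands above; a vstart cluster
# prints its own credentials, so treat these as placeholders.
RGW_HOST = "localhost:8000"
ACCESS_KEY = "0555b35654ad1656d804"
SECRET_KEY = "h7GhxuBLTrlhVUyxSPUKUV8r/2EI4ngqJxD7iBdBYLhwluN30JaT3Q=="


def s3cmd_args(*action, host=RGW_HOST, access_key=ACCESS_KEY,
               secret_key=SECRET_KEY):
    """Return the argv list for one s3cmd call against a local RGW."""
    return [
        "s3cmd",
        "--no-ssl",
        f"--host={host}",
        f"--host-bucket={host}/%(bucket)",
        f"--access_key={access_key}",
        f"--secret_key={secret_key}",
        *action,
    ]


if __name__ == "__main__":
    # Print shell-quoted commands rather than executing them, so the
    # sketch works even when s3cmd or the cluster is not available.
    for action in (("mb", "s3://mybucket"),
                   ("put", "myimage.jpg", "s3://mybucket")):
        print(shlex.join(s3cmd_args(*action)))
```

The returned list can be passed to `subprocess.run` to actually execute the command once s3cmd is installed and the cluster is up.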
Currently, bucket notifications support sending messages to HTTP, AMQP, and Kafka endpoints. To test bucket notifications over HTTP, follow these steps:
- assuming the local vstart cluster is already running
- set up the AWS CLI with RGW extensions according to this (note that this was tested only with AWS CLI v1). The s3cmd tool used above does not have the notification-related APIs, so we should use the AWS CLI tool here
- start an HTTP server (which accepts POST messages) on port 10900 (or any other port, but then change the topic configuration below accordingly). You can use the following Python server (note that you would need Python 3 for this server):
wget https://gist.githubusercontent.com/mdonkers/63e115cc0c79b4f6b8b3a6b797e485c7/raw/a6a1d090ac8549dac8f2bd607bd64925de997d40/server.py
python server.py 10900
- create a bucket notification topic with the above HTTP server as its endpoint:
aws --endpoint-url http://localhost:8000 sns create-topic --name=fishtopic \
--attributes='{"push-endpoint": "http://localhost:10900"}'
- create a bucket:
aws --endpoint-url http://localhost:8000 s3 mb s3://fish
- create a bucket notification tying together the above topic and bucket:
aws --region=default --endpoint-url http://localhost:8000 s3api put-bucket-notification-configuration \
--bucket fish \
--notification-configuration='{"TopicConfigurations": [{"Id": "notif1", "TopicArn": "arn:aws:sns:default::fishtopic", "Events": []}]}'
- create and upload a file to the bucket and make sure that the HTTP server got the notification:
head -c 1024 </dev/urandom > myfile
aws --endpoint-url http://localhost:8000 s3 cp myfile s3://fish
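As an alternative to the downloaded server.py used in the steps above, the HTTP endpoint can be sketched with the Python 3 standard library alone. The version below accepts POST requests, keeps the raw bodies, and prints a one-line summary per record. The summary assumes that the notification body follows the AWS S3 event message layout (a top-level "Records" list whose entries carry "eventName", "s3.bucket.name", and "s3.object.key"); verify this against the messages your cluster actually sends. The names `summarize`, `NotificationHandler`, and `make_server` are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize(body):
    """Return (event, bucket, key) tuples from a notification body.

    Assumes the AWS S3 event layout; returns an empty list for any
    body that does not match it.
    """
    try:
        records = json.loads(body)["Records"]
        return [
            (r["eventName"], r["s3"]["bucket"]["name"], r["s3"]["object"]["key"])
            for r in records
        ]
    except (ValueError, KeyError, TypeError):
        return []


class NotificationHandler(BaseHTTPRequestHandler):
    """Store and summarize every POST body, then answer 200 OK."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        self.server.received.append(body)
        for event, bucket, key in summarize(body):
            print(f"{event}: {bucket}/{key}")
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()


def make_server(port=10900):
    """Create (but do not start) a server on 127.0.0.1:port.

    Port 10900 matches the push-endpoint in the topic created above;
    port 0 lets the operating system pick a free port.
    """
    server = HTTPServer(("127.0.0.1", port), NotificationHandler)
    server.received = []  # raw bodies of all POST requests seen so far
    return server
```

Calling `make_server().serve_forever()` at the end of the file starts the server on port 10900. It must be running before the file is copied to the bucket.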
Write a small Golang app that uploads an object using the AWS Golang SDK. Code should be submitted as a PR that:
- adds a directory called "golang" under the Ceph examples
- adds your code under that directory
- adds a README.md file that holds full instructions on how to build and run your example against a vstart cluster
Write a small Java app that uploads an object using the AWS Java SDK. Code should be submitted as a PR that:
- adds a directory called "java" under the Ceph examples
- adds your code under that directory
- adds a README.md file that holds full instructions on how to build and run your example against a vstart cluster