@yuvalif
Created October 8, 2024 12:32

Test

this test assumes a Ceph cluster with RGW deployed via vstart

  • create the "log' bucket:
aws --endpoint-url http://localhost:8000 s3 mb s3://all-logs

Standard Mode

  • create a bucket for standard logging:
aws --endpoint-url http://localhost:8000 s3 mb s3://fish1
  • define bucket logging on the bucket:
aws --endpoint-url http://localhost:8000 s3api put-bucket-logging --bucket fish1 \
  --bucket-logging-status '{"LoggingEnabled": {"TargetBucket": "all-logs", "TargetPrefix": "fish1/"}}'
  • get the logging configuration:
aws --endpoint-url http://localhost:8000 s3api get-bucket-logging --bucket fish1

in the reply, there are defaults for the Ceph API extensions:

{
    "LoggingEnabled": {
        "TargetBucket": "all-logs",
        "TargetPrefix": "fish1/",
        "TargetObjectKeyFormat": {
            "SimplePrefix": {}
        },
        "ObjectRollTime": 300,
        "LoggingType": "Standard",
        "RecordsBatchSize": 0
    }
}
  • upload/download files to/from the bucket:
aws --endpoint-url http://localhost:8000 s3 cp myfile1 s3://fish1
aws --endpoint-url http://localhost:8000 s3 cp myfile2 s3://fish1
aws --endpoint-url http://localhost:8000 s3 cp myfile3 s3://fish1
aws --endpoint-url http://localhost:8000 s3 cp myfile4 s3://fish1
aws --endpoint-url http://localhost:8000 s3 cp myfile5 s3://fish1
aws --endpoint-url http://localhost:8000 s3 cp s3://fish1/myfile1 .
aws --endpoint-url http://localhost:8000 s3 cp s3://fish1/myfile2 .
aws --endpoint-url http://localhost:8000 s3 cp s3://fish1/myfile3 .
aws --endpoint-url http://localhost:8000 s3 cp s3://fish1/myfile4 .
aws --endpoint-url http://localhost:8000 s3 cp s3://fish1/myfile5 .
aws --endpoint-url http://localhost:8000 s3 ls s3://fish1
  • flush pending logs (the default object roll time is 5 minutes, and rollover is done lazily):
bin/radosgw-admin bucket logging flush --bucket fish1

to change the object roll time globally (without the API extensions), you can use the following conf parameter: rgw_bucket_logging_obj_roll_time
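
for example, a possible way to set this in a vstart cluster (the client.rgw target is an assumption; adjust it to your actual RGW daemon name, and the change may require restarting the gateway):
# assumes the RGW daemons read the client.rgw config section; the value is in seconds
bin/ceph config set client.rgw rgw_bucket_logging_obj_roll_time 60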

  • list the log objects:
aws --endpoint-url http://localhost:8000 s3api list-objects --bucket all-logs

result:

{
    "Contents": [
        {
            "Key": "fish1/2024-10-08-11-16-51-M14V9SSPFWX8YGF3",
            "LastModified": "2024-10-08T11:22:03.786000+00:00",
            "ETag": "\"99a2c3f71c17420be10e59cfdbf4e453\"",
            "Size": 1304,
            "StorageClass": "STANDARD",
            "Owner": {
                "ID": "testid"
            }
        },
        {
            "Key": "fish1/2024-10-08-11-22-03-55QTC8W16UL53YK4",
            "LastModified": "2024-10-08T11:24:56.492000+00:00",
            "ETag": "\"90b7929fd4e3bebffcb4c1b7de795306\"",
            "Size": 2410,
            "StorageClass": "STANDARD",
            "Owner": {
                "ID": "testid"
            }
        }
    ],
    "RequestCharged": null
}
  • view one of the log objects (the format is described here):
aws --endpoint-url http://localhost:8000 s3api get-object --bucket all-logs --key "fish1/2024-10-08-11-16-51-M14V9SSPFWX8YGF3" tmp && cat tmp

result:

testid fish1 [08/Oct/2024:11:16:49 +0000] - testid 03070348-111e-41eb-b1fc-8de7a1fed8e2.4179.17136269003076627006 REST.GET.get_bucket_logging - "GET /fish1?logging HTTP/1.1" 200 - - - - 1994ms - - - - - - - localhost - -
testid fish1 [08/Oct/2024:11:21:25 +0000] - testid 03070348-111e-41eb-b1fc-8de7a1fed8e2.4179.15049958399203899195 REST.PUT.put_obj myfile1 "PUT /fish1/myfile1 HTTP/1.1" 200 - 512 512 - 5ms - - - - - - - localhost - -
testid fish1 [08/Oct/2024:11:21:29 +0000] - testid 03070348-111e-41eb-b1fc-8de7a1fed8e2.4179.3529716976912471501 REST.PUT.put_obj myfile2 "PUT /fish1/myfile2 HTTP/1.1" 200 - 512 512 - 4ms - - - - - - - localhost - -
testid fish1 [08/Oct/2024:11:21:33 +0000] - testid 03070348-111e-41eb-b1fc-8de7a1fed8e2.4179.8597351677924501218 REST.PUT.put_obj myfile3 "PUT /fish1/myfile3 HTTP/1.1" 200 - 512 512 - 4ms - - - - - - - localhost - -
testid fish1 [08/Oct/2024:11:21:37 +0000] - testid 03070348-111e-41eb-b1fc-8de7a1fed8e2.4179.17002941671521503667 REST.PUT.put_obj myfile4 "PUT /fish1/myfile4 HTTP/1.1" 200 - 512 512 - 4ms - - - - - - - localhost - -
testid fish1 [08/Oct/2024:11:21:41 +0000] - testid 03070348-111e-41eb-b1fc-8de7a1fed8e2.4179.10270826524102175728 REST.PUT.put_obj myfile5 "PUT /fish1/myfile5 HTTP/1.1" 200 - 512 512 - 4ms - - - - - - - localhost - -

and the other:

aws --endpoint-url http://localhost:8000 s3api get-object --bucket all-logs --key "fish1/2024-10-08-11-22-03-55QTC8W16UL53YK4" tmp && cat tmp

result:

testid fish1 [08/Oct/2024:11:22:03 +0000] - testid 03070348-111e-41eb-b1fc-8de7a1fed8e2.4179.1163679657773750377 REST.HEAD.get_obj myfile1 "HEAD /fish1/myfile1 HTTP/1.1" 200 - - 512 - 6ms - - - - - - - localhost - -
testid fish1 [08/Oct/2024:11:22:03 +0000] - testid 03070348-111e-41eb-b1fc-8de7a1fed8e2.4179.13451860939191171703 REST.GET.get_obj myfile1 "GET /fish1/myfile1 HTTP/1.1" 200 - - 512 - 1ms - - - - - - - localhost - -
testid fish1 [08/Oct/2024:11:22:06 +0000] - testid 03070348-111e-41eb-b1fc-8de7a1fed8e2.4179.10501611414128746418 REST.HEAD.get_obj myfile3 "HEAD /fish1/myfile3 HTTP/1.1" 200 - - 512 - 1ms - - - - - - - localhost - -
testid fish1 [08/Oct/2024:11:22:06 +0000] - testid 03070348-111e-41eb-b1fc-8de7a1fed8e2.4179.10933534561728825757 REST.GET.get_obj myfile3 "GET /fish1/myfile3 HTTP/1.1" 200 - - 512 - 1ms - - - - - - - localhost - -
testid fish1 [08/Oct/2024:11:22:09 +0000] - testid 03070348-111e-41eb-b1fc-8de7a1fed8e2.4179.1145944454414059344 REST.HEAD.get_obj myfile2 "HEAD /fish1/myfile2 HTTP/1.1" 200 - - 512 - 2ms - - - - - - - localhost - -
testid fish1 [08/Oct/2024:11:22:09 +0000] - testid 03070348-111e-41eb-b1fc-8de7a1fed8e2.4179.2386762757964382807 REST.GET.get_obj myfile2 "GET /fish1/myfile2 HTTP/1.1" 200 - - 512 - 1ms - - - - - - - localhost - -
testid fish1 [08/Oct/2024:11:22:12 +0000] - testid 03070348-111e-41eb-b1fc-8de7a1fed8e2.4179.17714248039542601710 REST.HEAD.get_obj myfile4 "HEAD /fish1/myfile4 HTTP/1.1" 200 - - 512 - 1ms - - - - - - - localhost - -
testid fish1 [08/Oct/2024:11:22:12 +0000] - testid 03070348-111e-41eb-b1fc-8de7a1fed8e2.4179.15806212711924203784 REST.GET.get_obj myfile4 "GET /fish1/myfile4 HTTP/1.1" 200 - - 512 - 1ms - - - - - - - localhost - -
testid fish1 [08/Oct/2024:11:22:16 +0000] - testid 03070348-111e-41eb-b1fc-8de7a1fed8e2.4179.7693100589051267205 REST.HEAD.get_obj myfile5 "HEAD /fish1/myfile5 HTTP/1.1" 200 - - 512 - 2ms - - - - - - - localhost - -
testid fish1 [08/Oct/2024:11:22:16 +0000] - testid 03070348-111e-41eb-b1fc-8de7a1fed8e2.4179.11895882796410803782 REST.GET.get_obj myfile5 "GET /fish1/myfile5 HTTP/1.1" 200 - - 512 - 1ms - - - - - - - localhost - -
testid fish1 [08/Oct/2024:11:22:29 +0000] - testid 03070348-111e-41eb-b1fc-8de7a1fed8e2.4179.2373949767378283374 REST.GET.list_bucket - "GET /fish1?list-type=2&prefix=&delimiter=%2F&encoding-type=url HTTP/1.1" 200 - - - - 4ms - - - - - - - localhost - -

note that the "operation" field does not follow the same names as defined by AWS. e.g. REST.PUT.OBJECT in AWS is REST.PUT.put_obj in Ceph)

Journal Mode

  • to enable our extension to the API when using Python (boto3) or the AWS CLI, the following file has to be placed under ~/.aws/models/s3/2006-03-01/ (the directory should be created if it does not exist; see the sketch after these notes)
  • currently there is no generic solution for other client SDKs
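
a minimal sketch of preparing the models directory (the destination path is taken from the note above; the file name below is only a placeholder for the extension model file referenced there):
mkdir -p ~/.aws/models/s3/2006-03-01
# copy the extension model file (placeholder name) into the directory
cp ceph-s3-extensions.json ~/.aws/models/s3/2006-03-01/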
  • create a bucket for journal logging:
aws --endpoint-url http://localhost:8000 s3 mb s3://fish2
  • and create the logging configuration:
aws --endpoint-url http://localhost:8000 s3api put-bucket-logging --bucket fish2 \
  --bucket-logging-status '{"LoggingEnabled": {"TargetBucket": "all-logs", "TargetPrefix": "fish2/", "ObjectRollTime": 5, "LoggingType": "Journal"}}'
  • get the logging configuration:
aws --endpoint-url http://localhost:8000 s3api get-bucket-logging --bucket fish2

the reply:

{
    "LoggingEnabled": {
        "TargetBucket": "all-logs",
        "TargetPrefix": "fish2/",
        "TargetObjectKeyFormat": {
            "SimplePrefix": {}
        },
        "ObjectRollTime": 5,
        "LoggingType": "Journal",
        "RecordsBatchSize": 0
    }
}
  • upload/download files to/from the bucket:
aws --endpoint-url http://localhost:8000 s3 cp myfile1 s3://fish2
aws --endpoint-url http://localhost:8000 s3 cp myfile2 s3://fish2
aws --endpoint-url http://localhost:8000 s3 cp myfile3 s3://fish2
aws --endpoint-url http://localhost:8000 s3 cp myfile4 s3://fish2
aws --endpoint-url http://localhost:8000 s3 cp myfile5 s3://fish2
aws --endpoint-url http://localhost:8000 s3 cp s3://fish2/myfile1 .
aws --endpoint-url http://localhost:8000 s3 cp s3://fish2/myfile2 .
aws --endpoint-url http://localhost:8000 s3 cp s3://fish2/myfile3 .
aws --endpoint-url http://localhost:8000 s3 cp s3://fish2/myfile4 .
aws --endpoint-url http://localhost:8000 s3 cp s3://fish2/myfile5 .
aws --endpoint-url http://localhost:8000 s3 ls s3://fish2
  • flush pending logs (rollover is set to 5 seconds, but still done lazily):
bin/radosgw-admin bucket logging flush --bucket fish2
  • list the log objects:
aws --endpoint-url http://localhost:8000 s3api list-objects --bucket all-logs

result (distinguish between logs of the different source buckets according to prefix):

{
    "Contents": [
        {
            "Key": "fish1/2024-10-08-11-16-51-M14V9SSPFWX8YGF3",
            "LastModified": "2024-10-08T11:22:03.786000+00:00",
            "ETag": "\"99a2c3f71c17420be10e59cfdbf4e453\"",
            "Size": 1304,
            "StorageClass": "STANDARD",
            "Owner": {
                "ID": "testid"
            }
        },
        {
            "Key": "fish1/2024-10-08-11-22-03-55QTC8W16UL53YK4",
            "LastModified": "2024-10-08T11:24:56.492000+00:00",
            "ETag": "\"90b7929fd4e3bebffcb4c1b7de795306\"",
            "Size": 2410,
            "StorageClass": "STANDARD",
            "Owner": {
                "ID": "testid"
            }
        },
        {
            "Key": "fish2/2024-10-08-12-16-24-DA7LAKNUO909QGRX",
            "LastModified": "2024-10-08T12:18:15.753000+00:00",
            "ETag": "\"5bba576297756b1d724bad714905c60c\"",
            "Size": 520,
            "StorageClass": "STANDARD",
            "Owner": {
                "ID": "testid"
            }
        },
        {
            "Key": "fish2/2024-10-08-12-18-56-NQ6QRM9S85IKXPYR",
            "LastModified": "2024-10-08T12:19:10.802000+00:00",
            "ETag": "\"85ca28e5d16df62caf75ed94f4b4974a\"",
            "Size": 520,
            "StorageClass": "STANDARD",
            "Owner": {
                "ID": "testid"
            }
        }
    ],
    "RequestCharged": null
}
  • view one of the log objects (the format is described here):
aws --endpoint-url http://localhost:8000 s3api get-object --bucket all-logs --key "fish2/2024-10-08-12-18-56-NQ6QRM9S85IKXPYR" tmp && cat tmp

result (GET, HEAD and LIST are not logged):

testid fish2 [08/Oct/2024:12:18:56 +0000] myfile1 REST.PUT.put_obj 512 300218cb9a99704c77b962119ca9a7c4
testid fish2 [08/Oct/2024:12:18:56 +0000] myfile2 REST.PUT.put_obj 512 318d5d29d89dac8378694b6f846f41f6
testid fish2 [08/Oct/2024:12:18:57 +0000] myfile3 REST.PUT.put_obj 512 9164ee3c6963eae380457235a0bae298
testid fish2 [08/Oct/2024:12:18:57 +0000] myfile4 REST.PUT.put_obj 512 e1a1079623ab9a248f7322efc893944f
testid fish2 [08/Oct/2024:12:18:58 +0000] myfile5 REST.PUT.put_obj 512 b83159995bc2858bf3b237e25ed800d6
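
as with standard mode, the journal records are space separated; for example, a quick way to list the object key, operation, size, and ETag of each record (field positions assume the layout shown above):
# print object key, operation, size and ETag from each journal record
awk '{print $5, $6, $7, $8}' tmp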

TODO

P1

  • authorization/ownership model
    • currently we set the owner of the log bucket to be the owner of all log objects
    • no support for TargetGrants
    • automatic read policies on the log bucket
  • standard operation names
  • versioned buckets support in journal mode
  • flush REST API

P2

  • missing frontend data in standard mode
  • in-memory buffered logs in standard mode for better performance (e.g. when logging GET/HEAD operations)