Tidy Up Song

Background

Static analysis is a powerful tool for finding code issues. It does, however, have its costs. Careful tuning is required so that developers are not overwhelmed by "false positive" detections while critical errors are still caught. It also requires simple integration into the development flow, so that new issues are continuously detected and addressed.

This project uses clang-tidy for automated static analysis of the Ceph code base.

clang-tidy, a component of the Clang toolchain (which Ceph already supports), is a widely used open-source 'linter'.

clang-tidy is not a "static analyzer" per se, but close enough - and is actually wider in scope. It is highly tunable and easy to integrate.

(For a discussion of the various tools that come to mind when analyzing C++ code, and of what separates clang-tidy from Clang's static analyzer tool, this article by Maurizio Martignano is an interesting read: https://www.linkedin.com/pulse/its-2020-time-touch-base-static-analysis-maurizio-martignano)

Evaluation Stage

Step 1 - Build Ceph and Run Basic Test

To start, you need access to a Linux-based development environment; a 4-CPU machine with 16 GB of RAM and 100 GB of disk is the minimum requirement.

Unless you already have a Linux distro you prefer, I would recommend choosing from:

Fedora (38/39) - my favorite!
Ubuntu (22.04 LTS)
WSL (Windows Subsystem for Linux)
Other Linux distros - try at your own risk :-)

Once you have that up and running, clone the Ceph repo from GitHub (https://github.com/ceph/ceph). If you don’t know what GitHub and git are, this is the right time to close these gaps :-) And yes, you should have a GitHub account, so that you can later share your work on the project.

Install any missing system dependencies using:

./install-deps.sh

Please note that the first build may take a considerable amount of time; the following cmake parameters can be used to reduce it. With a fresh Ceph clone, use:

./do_cmake.sh -DWITH_SEASTAR=OFF -DCMAKE_C_COMPILER="clang" -DCMAKE_CXX_COMPILER="clang++" -DCMAKE_CXX_FLAGS:STRING="-std=gnu++20" -DALLOCATOR=tcmalloc -DWITH_LTTNG=OFF -DWITH_RDMA=OFF -DWITH_RADOSGW_LUA_PACKAGES=ON -DWITH_MANPAGE=OFF -DWITH_CCACHE=ON -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_EXPORT_COMPILE_COMMANDS:BOOL=ON -G Ninja -DBOOST_J=$(nproc) -DWITH_MGR_DASHBOARD_FRONTEND=OFF -DWITH_DPDK=OFF -DWITH_SPDK=OFF -DWITH_RBD=OFF -DWITH_KRBD=OFF -DWITH_RADOSGW_SELECT_PARQUET=OFF -DWITH_PYTHON3=3.12 -DWITH_SYSTEM_QATLIB=ON -DWITH_SYSTEM_QATZIP=ON

(For Fedora 38: use -DWITH_PYTHON3=3.11 instead)

If the build directory already exists, you can regenerate the Ninja files by running, from the build directory:

cmake -DBOOST_J=$(nproc) ......

Once do_cmake.sh completes, move to the newly created 'build' directory, and initiate the build process:

ninja
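
Because the do_cmake.sh line above passes -DCMAKE_EXPORT_COMPILE_COMMANDS:BOOL=ON, the build directory should contain a compile_commands.json file. A quick sanity check might look like the sketch below. The helper name has_compile_entry and the choice of src/osd/OSD.cc as a probe file are illustrative assumptions, and the grep pattern assumes the "file": "..." layout that CMake normally writes.

```shell
# Hypothetical helper: succeed if the compilation database at $1 has an
# entry whose "file" path ends with $2.
has_compile_entry() {
  local db="$1" suffix="$2"
  [ -f "$db" ] || return 1
  grep -q "\"file\": *\"[^\"]*${suffix}\"" "$db"
}

# Usage, from the Ceph source root (paths are assumptions):
#   has_compile_entry build/compile_commands.json src/osd/OSD.cc \
#     && echo "compilation database looks usable"
```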

Assuming the build completes successfully, you can run the unit tests (see: https://github.com/ceph/ceph#running-unit-tests).

Now you are ready to run the Ceph processes, as explained here: https://github.com/ceph/ceph#running-a-test-cluster. You will probably also want to check the developer guide (https://docs.ceph.com/docs/master/dev/developer_guide/) and learn more about how to build Ceph and run it locally (https://docs.ceph.com/docs/master/dev/quick_guide/).

For this qualification step: Run a specific standalone test, and copy the results. Use the following shell line:

SFS="/tmp/tst_`date +'%d_%T'`"; echo $SFS;  date; time ../qa/run-standalone.sh -x -v "osd-scrub-snaps.sh" 2>&1 | awk '{ print strftime(),$0 }' > $SFS; tail $SFS

The output of the 'tail' command includes a success/failure message for the test (which is expected to succeed). Share the contents of the $SFS file with us.
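
For readers who find the one-liner above dense, here is a roughly equivalent sketch split into two functions. The function names stamp_lines and run_stamped_test are hypothetical. The sketch also swaps awk's strftime(), which is not part of POSIX awk (gawk provides it), for a call to date on each line, and it leaves out the date and time commands of the original.

```shell
# Prefix each line read from stdin with the current date and time.
stamp_lines() {
  local line
  while IFS= read -r line || [ -n "$line" ]; do
    printf '%s %s\n' "$(date '+%F %T')" "$line"
  done
}

# Hypothetical wrapper around the standalone-test command from the text;
# run it from the build directory. $1 is the test script name.
run_stamped_test() {
  local test_name="$1"
  local log="/tmp/tst_$(date +'%d_%T')"
  echo "$log"
  ../qa/run-standalone.sh -x -v "$test_name" 2>&1 | stamp_lines > "$log"
  tail "$log"
}

# Usage (from the build directory):
#   run_stamped_test osd-scrub-snaps.sh
```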

Step 2 - Exercising Problem Detection & Fixing Using clang-tidy

Run a small subset of clang-tidy checks on a subset of Ceph source files.

The source files for the OSD, one of the Ceph daemons, are located in ceph/src/osd. We suggest starting your tests with a meaningful subset of these files. That subset may contain OSD.* + PG* + PrimaryLog*.
Make sure there is a relevant compile_commands.json file. compile_commands.json (which should have been created by the build process) lists the command line used to compile each source file. clang-tidy relies on it to parse each source file.
Select a very small set of checks. The checks should not be about style or specific style guides: use a small set of bug-related or performance-related clang-tidy checks.
Identify a few actual problems in the generated report.
Issue a fix for at least one of these problems (including a PR that describes the problem and the fix).

Note: at this stage - do not report the problems detected to the Ceph bug list.
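
The steps above can be sketched as a small script. Everything specific in it is an assumption made for illustration: the helper names, the build and source paths, and the particular check names (all of them are existing clang-tidy checks, but they are not a selection recommended by the project). Only .cc files are listed, since headers are analyzed through the sources that include them.

```shell
# Print the .cc files of the suggested subset (OSD.*, PG*, PrimaryLog*)
# found directly in directory $1.
osd_subset() {
  local src_dir="$1"
  find "$src_dir" -maxdepth 1 -type f \
    \( -name 'OSD.cc' -o -name 'PG*.cc' -o -name 'PrimaryLog*.cc' \) | sort
}

# Example selection: disable all checks, then enable a handful of
# bug-related and performance-related ones.
TIDY_CHECKS='-*,bugprone-use-after-move,bugprone-dangling-handle,performance-unnecessary-copy-initialization,performance-for-range-copy'

# Run clang-tidy on the subset and save the report.
# $1: directory holding compile_commands.json, $2: source dir, $3: report file.
run_tidy_subset() {
  local build_dir="$1" src_dir="$2" report="$3"
  osd_subset "$src_dir" \
    | xargs -r clang-tidy -p "$build_dir" -checks="$TIDY_CHECKS" > "$report" 2>&1
}

# Usage, from the Ceph source root (paths are assumptions):
#   run_tidy_subset build src/osd /tmp/tidy_osd.txt
```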


Project Goals

Note: start with a subset of Ceph code. The OSD code might be a good selection.

1. make sure that Ceph compiles under clang;
2. tune clang-tidy to find important issues that are common in Ceph (looking for a relatively small subset of critical issues);
3. document and justify the set of checks you have selected;
4. clean up the issues found in (2), by one of:
- submitting PRs for real issues that should be fixed, or
- annotating the code so that clang-tidy ignores a false positive, or
- tuning clang-tidy so that it won't emit "false positives";
  Note: after step 4, the set of checks you have selected should no longer produce false positives.
5. add the checks to Jenkins/GitHub Actions (those should be marked as non-blocking).
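
As a rough illustration of the tuning and non-blocking goals, the sketch below writes a minimal .clang-tidy configuration file and wraps a command so that its failure does not fail the calling job. The function names, the check selection, and the header filter are assumptions, not project decisions. Individual false positives can also be silenced in the source with clang-tidy's // NOLINT(check-name) and // NOLINTNEXTLINE(check-name) comments, and CI systems usually have their own switch for non-blocking steps (GitHub Actions, for instance, has continue-on-error).

```shell
# Write an example .clang-tidy file into directory $1. clang-tidy looks for
# this file in the closest parent directory of each source file it analyzes.
# The check selection and header filter are assumptions, not project policy.
write_tidy_config() {
  local dir="$1"
  cat > "$dir/.clang-tidy" <<'EOF'
Checks: '-*,bugprone-use-after-move,performance-for-range-copy'
HeaderFilterRegex: 'src/osd/.*'
EOF
}

# Run any command and report a non-zero exit status, but always return 0 so
# that a CI job treats the step as non-blocking.
run_nonblocking() {
  local status=0
  "$@" || status=$?
  if [ "$status" -ne 0 ]; then
    echo "non-blocking step failed with status $status: $*" >&2
  fi
  return 0
}

# Usage (paths are assumptions):
#   write_tidy_config src/osd
#   run_nonblocking clang-tidy -p build src/osd/OSD.cc
```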