@retpolanne
Last active March 23, 2024 19:26
GSoC Linux Foundation proposal

Chosen project

perf - Scalability and speed

Verbatim:

complexity: intermediate or hard

duration: small, medium or large

requirements: machine to work and test on, typically a bare metal (ie not cloud) Linux machine. C programming, multi-threading/pthread library.

The perf tool is largely single threaded even though sometimes it needs to do something on every CPU in the system. This is embarrassingly parallel but the tool isn't exploiting it. Work was done to create a work pool mechanism but not merged due to latent bugs in memory management. Address sanitizer and reference count checking have solved this problem but we still need to integrate the work pool code.

Another improvement is that currently the perf report command will process an entire perf.data file before providing a visualization. This can be slow for large perf.data files. In contrast, the perf top command will gather data in the background while providing a visualization. Breaking apart the perf report command so that processing is performed on a background thread with the visualization periodically refreshing in the foreground will mean that at least during the slow load the user can do something.

One more thing can do is to reduce the number of file descriptors in perf record with --threads option. Currently it needs a couple of pipes to communicate between the worker threads. I think it can be greatly reduced by using eventfd(2) instead of having pipes for each thread.
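
To illustrate the last point of the quoted idea: a pipe costs two file descriptors (a read end and a write end), while eventfd(2) provides a wakeup counter behind a single descriptor. The following is only a minimal sketch of that mechanism, not perf's actual code; the helper names `wakeup_open`, `wakeup_post` and `wakeup_collect` are hypothetical and were made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: open one wakeup channel backed by a single fd. */
int wakeup_open(void)
{
    /* Counter starts at 0; non-blocking so an empty read fails with EAGAIN. */
    return eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
}

/* Post one wakeup: the kernel adds the written value to the counter. */
int wakeup_post(int fd)
{
    uint64_t one = 1;

    assert(fd >= 0);
    return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * Collect pending wakeups: returns the counter value and resets it to zero.
 * Returns 0 when nothing is pending and -1 on any other error.
 */
int64_t wakeup_collect(int fd)
{
    uint64_t count = 0;

    assert(fd >= 0);
    if (read(fd, &count, sizeof(count)) == (ssize_t)sizeof(count))
        return (int64_t)count;
    return errno == EAGAIN ? 0 : -1;
}
```

A worker would call `wakeup_post()` on the descriptor and the thread waiting on it would call `wakeup_collect()` after poll(2) reports it readable. Several posts collapse into one counter value, which is fine for a pure "something happened" signal but would not carry a message payload the way a pipe can.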

Proposal: work on reporting optimization. I'll mostly work on the perf report improvements so that we can load big perf.data files without too much hassle. I'll also work on other reporting optimizations suggested by the mentors.
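
The general shape of the perf report change, as I understand it from the project idea, is a background thread that processes events while the foreground periodically redraws from whatever has been processed so far. The sketch below is an assumption about that shape, not code taken from perf: `struct load_state`, `loader_main` and `load_with_refresh` are hypothetical names, and incrementing a counter stands in for real event processing.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state shared between the loader thread and the UI thread. */
struct load_state {
    size_t total;             /* number of events to process */
    atomic_size_t processed;  /* events handled so far */
    atomic_bool done;         /* set by the loader when it finishes */
};

/* Background thread: processes events and publishes its progress. */
static void *loader_main(void *arg)
{
    struct load_state *st = arg;

    for (size_t i = 0; i < st->total; i++)
        atomic_fetch_add(&st->processed, 1); /* stand-in for one event */
    atomic_store(&st->done, true);
    return NULL;
}

/*
 * Foreground: start the loader, call refresh() (if given) until the loader
 * is done, then once more with the final count. Returns the number of
 * events processed, or 0 if the thread could not be created.
 */
size_t load_with_refresh(size_t total,
                         void (*refresh)(size_t processed, size_t total))
{
    struct load_state st = { .total = total };
    pthread_t loader;
    size_t final;

    atomic_init(&st.processed, 0);
    atomic_init(&st.done, false);

    if (pthread_create(&loader, NULL, loader_main, &st) != 0)
        return 0;

    while (!atomic_load(&st.done)) {
        if (refresh)
            refresh(atomic_load(&st.processed), total);
        sched_yield(); /* a real UI would wait on a timer or input instead */
    }
    pthread_join(loader, NULL);

    final = atomic_load(&st.processed);
    assert(final == total);
    if (refresh)
        refresh(final, total);
    return final;
}
```

Calling `load_with_refresh(n, draw)` with some `draw(processed, total)` callback gives the foreground a chance to show partial results during the slow load. The real work would be harder than this sketch suggests, because the data structures that perf report builds would have to be safe to read while the loader is still updating them.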

You

Anne Isabelle Macedo

Systems Engineer @ Nubank

I'm currently a systems engineer, but I have worked before as a software developer and infrastructure engineer. I've been contributing on and off to open source software since 2019, and in 2022 I started looking into the Linux kernel.

IRC: retpolanne

Coding Skills: I mainly code in Golang, Bash, C and Python, and I have started learning Rust.

In your application let us know

  • What platform do you use to code? Hardware specifications and operating system

MacBook Pro M3 Pro (10 core aarch64 + 18GB) + macOS + Debian running on top of QEMU

  • Did you ever code in C or C++/Perl/python/…, yes/no? what is your experience?

Yes. I submitted some patches to the kernel in C, but they were rejected. I have contributed to Python's pip, Ansible, Kubernetes and a few other projects in Python and Golang.

You and Us

  • Were you involved in development in the project's group in the past? What was your contribution?

No

  • Were you involved in other OpenSource development projects in the past? which, when and in what role?

Yes, I have occasionally contributed to Kubernetes-related repos and to Python's pip.

  • Why have you chosen your development idea and what do you expect from your implementation?

I think that perf reporting can be a good entry point for other things related to perf. Improving it will help me familiarize myself with the tool, find other areas to improve, and find better ways of using perf.

Your Project

  • What do you want to achieve?

I want to get more hands-on experience contributing to perf in the Linux ecosystem, help improve perf's usability and support, and find better usage opportunities for it.

  • If you have chosen an idea from our list, why did you choose this specific idea?

I think perf reporting is a great entry point to the rest of perf and Linux observability. This will help me get a good grasp of the concepts behind perf so I can start to improve those areas as well. Also, given my time constraints, I think this is a project I can commit to, because it is not as complex and time-consuming as other tasks.

  • How much time do you plan to invest in the project before, during and after the Summer of Code (We expect full time 40h/week during GSoC, but better make this explicit)?

Before: I'm currently studying perf and BPF, mostly on weekdays from 6 to 10pm.

During: I'll commit to working on perf about 20h/week (same 6-10pm schedule), and I can stretch into weekends.

After: I plan on continuing my journey with perf and BPF during work hours (we have some cases where perf may be needed) and after hours, but I won't commit to a specific timeframe after.

  • Please provide a schedule of how this time will be spent on subtasks of the project. While this is only preliminary, you will be required to provide a detailed plan latest at the beginning of GSoC and during the project you will issue weekly progress reports against that plan.

Most of the setup for development is already done, so I should get straight to coding in the first week. I'll spend the first three days running perf, trying to reproduce the slowness in perf report, and identifying the relevant code paths. From there I should start working on submitting patches.
