Skip to content

Instantly share code, notes, and snippets.

@connornishijima
Created January 22, 2025 15:39
Show Gist options
  • Select an option

  • Save connornishijima/414e48e73cc9b01d3c7cd40cc2d1f7fc to your computer and use it in GitHub Desktop.

Select an option

Save connornishijima/414e48e73cc9b01d3c7cd40cc2d1f7fc to your computer and use it in GitHub Desktop.
SPF - A simple and powerful profiler for Espressif microcontrollers by Connor Nishijima
// -------------------------------------------------------------------
#define PROFILER_ENABLED 1 // ENABLED = MINOR OVERHEAD!
/* _______ _______ _______
| | | | | |
| _____| | _ | | ___|
| |_____ | |_| | | |___
|_____ | | ___| | ___|
_____| | | | | |
|_______| |___| |___|
##############################################################
CONNOR NISHIJIMA'S WILDLY EASY *AND* POWERFUL ESP32 PROFILER
Released under the GPLv3 license. Enjoy!
##############################################################
How much CPU % does each section of your code cost?
This profiler can tell you in 3 simple steps.
1. Include this file in your project
2. Write "spf()" at the top of functions you want to measure,
and "epf()" at the bottom.
3. Call "init_profiler();" once at boot to start the profiler task
That's it! The profiler will automatically print a report every
3 seconds over UART with the CPU USAGE % of all currently
measured functions.
SPF reports print exactly like this:
--- PROFILER ------------------------------------------
CPU % | FUNCTION NAME | FILE:LINE
--------+----------------------+-----------------------
00.749% | run_particles | src/particles.h:32
00.021% | count_fps | src/main.cpp:76
21.699% | draw_lines | src/main.cpp:650
77.448% | RENDERING BEAM | src/main.cpp:687
00.084% | loop | src/main.cpp:810
The names and locations of your functions are automatically detected by
the compiler, and you can even measure arbitrary subsections of code
with spf_named( name ) like this:
spf_named( "RENDERING BEAM" );
// Do work here
epf();
This is a simple but efficient profiler for any Espressif microcontroller
This file should be compatible with Arduino and ESP-IDF 5.x out of box,
since it uses only IDF and C standard libraries.
It returns the relative percentage of CPU cycles taken by all measured
functions, and is designed to have microscopic overhead with clever memory
use. When disabled, the compiler will replace all function calls to profiler
functions with empty variants that consume no program space or CPU time.
--- FUNCTIONS ----------------------------------------------------------------------
- init_profiler():
Call once at boot to prepare memory and start the profiler task
- spf():
"Start Profiling" - Write this at the top of all functions that
you want to measure, including your main loop. This automatically
stores the current function name, name of the source code file,
the line number, and the parent function which called it. More
on that later.
- epf():
"End Profiling" - Write this at the bottom of every measured
function to properly close them
- spf_named( name ):
"Start profile on custom named subsection of a function"
Works exactly like spf(), but you can override the name
and use it freely within subsections of code.
ALSO REQUIRES A CLOSING epf() call of it's own afterwards
--- HOW IT WORKS (COMPILER) --------------------------------------------------------
When you build your code, the C Preprocessor will find all instances of
"spf()" that you wrote and automatically replace them with this:
start_profile(__func__, __FILE__, __LINE__, __COUNTER__);
The __func__, __FILE__, __LINE__, and __COUNTER__ are special C macros.
Importantly, __func__ is always a const char* of the exact name of the function
in the source code where __func__ was written and __COUNTER__ is replaced with
a unique integer for every single occurance in your code.
For example, a "sketch.ino" file like this:
1 void my_function(){
2 spf();
3 // Custom code to measure
4 epf();
5 }
6
7 void my_other_function(){
8 spf();
9 // Custom code to measure
10 epf();
11 }
12
13 void setup(){
14 init_profiler();
15 my_function();
16 my_other_function();
17 }
...gets automatically converted into this before being uploaded to the chip:
1 void my_function(){
2 start_profile("my_function", "sketch.ino", 2, 0); // Line 2, Counter 0
3 // Custom code to measure
4 end_profile();
5 }
6
7 void my_other_function(){
8 start_profile("my_other_function", "sketch.ino", 8, 1); // Line 8, Counter 1
9 // Custom code to measure
10 end_profile();
11 }
12
13 void setup(){
14 init_profiler();
15 my_function();
16 my_other_function();
17 }
This way you can just re-use the same "spf()" and "epf()" lines
everywhere without redundantly entering function names while you work.
--- HOW IT WORKS: (RUNTIME) --------------------------------------------------------
At boot, [core_to_run_profiler_on] is given a task where it samples the
"currently_executing_section" at around 1000Hz.
Every time you call spf(), the previous value of currently_executing_section
is pushed to a stack, and it's overwritten by the value of __COUNTER__ in the
underlying "start_profile()" behind the macro. __COUNTER__ is used in this way
to automatically give each profiled section of code a unique index in the stack.
Every time you call epf(), the currently_executed_section is returned
to the previous value in the stack, or in other words: returning from this
function back to the parent who called it.
During runtime, every [profile_print_interval_ms] milliseconds, the profiler
will automatically print out a report of the execution time of all measured functions:
--- PROFILER ------------------------------------------
CPU % | FUNCTION NAME | FILE:LINE
--------+----------------------+-----------------------
00.749% | run_particles | src/particles.h:32
00.021% | count_fps | src/main.cpp:76
21.699% | draw_lines | src/main.cpp:650
77.448% | RENDERING BEAM | src/main.cpp:687
00.084% | loop | src/main.cpp:810
In this case, a function called "draw_lines()" is taking up 22.85% of all
CPU cycles, and a custom named subsection "RENDERING BEAM" is taking up
76.34% of all CPU cycles. The remaining time is spent in menial functions
that don't cost many cycles themselves, such as the loop() which immediately
calls "draw_particle_lines()" and jumps away from itself upon starting.
*/
// For uint32_t, uint16_t, etc.
#include <stdint.h>
// For strncpy, strlen, etc.
#include <string.h>
// For printf, etc.
#include <stdio.h>
// For delaying and tasks
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"
// Default to running profiler on opposite core of Arduino, if Arduino
#ifdef ARDUINO_RUNNING_CORE
#define DEFAULT_CORE_TO_RUN_PROFILER_ON (!ARDUINO_RUNNING_CORE)
#else
#define DEFAULT_CORE_TO_RUN_PROFILER_ON 0 // Default to core 0
#endif
// Maximum number of "spf()" occurrences in your code at a time, costs 40 bytes per function
#define MAX_PROFILED_FUNCTIONS 32 // 32 = 1.28K
// 64 = 2.56K
// 128 = 5.12K of RAM
// Print a report every 3 seconds
#define DEFAULT_PROFILE_PRINT_INTERVAL_MS 3000
// INSTRUCTIONS FOR THE COMPILER:
// FIND this ----> REPLACE with this
#define spf() start_profile( __func__, __FILE__, __LINE__, __COUNTER__ )
#define spf_named( s ) start_profile( s, __FILE__, __LINE__, __COUNTER__ )
#define epf() end_profile()
#define MIN(a,b) (((a)<(b))?(a):(b))
// If the profiler is disabled, all related functions are left empty instead
// A modern compiler likely realizes these are empty and simply removes
// them from existence so that no CPU cycles are spent in assembly jumping
// in and out of them.
#if PROFILER_ENABLED == 0
inline void start_profile(const char *func, const char *file, uint32_t line, uint32_t counter){}
inline void end_profile(){}
void init_profiler(){}
// Everything within this only exists when the profiler is enabled
#elif PROFILER_ENABLED == 1
// Updated dynamically with __COUNTER__ every time
// that spf() and epf() occur in the code:
volatile int32_t currently_executing_section = 0;
// Can be updated later in init_profiler()
uint32_t profile_print_interval_ms = DEFAULT_PROFILE_PRINT_INTERVAL_MS;
// Functions are tracked with these structures
typedef struct {
char func[16]; // Function name
char file[16]; // File it's in
uint16_t line; // Line it's on
uint16_t parent; // Function which called it
uint32_t hits; // Number of times it's been spotted running
} profile_t;
// The record of all measured functions
static profile_t profile_pile[MAX_PROFILED_FUNCTIONS];
uint32_t get_ms(){
return xTaskGetTickCount() * portTICK_PERIOD_MS;
}
// The task that prints out the profiler data from the opposite core
void profiler_task(void *pvParameters){
while(1){
// Wait one millisecond
vTaskDelay(pdMS_TO_TICKS(1));
// If we're not before the first call to spf(),
// mark the current section as having been seen
if(currently_executing_section != 0){
profile_pile[currently_executing_section].hits++;
}
// Check if enouch time has passed to print the next profiler report
static uint32_t last_profile_print_ms = get_ms();
uint32_t current_time_ms = get_ms();
bool print_report = (current_time_ms - last_profile_print_ms >= profile_print_interval_ms);
// If it's time to print the report, do so
if(print_report == true){
last_profile_print_ms = current_time_ms;
// Calculate the total number of hits
uint32_t total_hits = 0;
for(uint32_t i = 0; i < MAX_PROFILED_FUNCTIONS; i++){
total_hits += profile_pile[i].hits;
}
// Print all functions as a percentage of total hits
printf("--- SPF PROFILER --------------------------------------\n");
printf("CPU % | FUNCTION NAME | FILE:LINE\n");
printf("--------+----------------------+-----------------------\n");
for(uint32_t i = 0; i < MAX_PROFILED_FUNCTIONS; i++){
if(profile_pile[i].hits > 0){
float cpu_usage = (profile_pile[i].hits / (float)total_hits) * 100.0;
printf(
"%06.3f%% | %-20s | %s:%d\n",
cpu_usage,
profile_pile[i].func,
profile_pile[i].file,
profile_pile[i].line
);
}
}
printf("-------------------------------------------------------\n");
}
}
}
// Start profiling a function
// This is filled out automatically and called by the spf() macro
inline void start_profile(const char *func, const char *file, uint32_t line, uint32_t counter){
if(counter < MAX_PROFILED_FUNCTIONS){
if(line != profile_pile[counter].line){
strncpy(profile_pile[counter].func, func, 15);
profile_pile[counter].func[15] = '\0'; // Ensure null termination
strncpy(profile_pile[counter].file, file, 15);
profile_pile[counter].file[15] = '\0'; // Ensure null termination
profile_pile[counter].line = line;
}
profile_pile[counter].parent = currently_executing_section;
currently_executing_section = counter;
}
}
// End profiling a function
// This is called by the epf() macro
inline void end_profile(){
currently_executing_section = profile_pile[currently_executing_section].parent;
}
// Run this once at boot to prepare memory and start the profiler task
void init_profiler(uint32_t print_interval_ms = DEFAULT_PROFILE_PRINT_INTERVAL_MS, uint32_t core_to_run_profiler_on = DEFAULT_CORE_TO_RUN_PROFILER_ON){
memset(profile_pile, 0, sizeof(profile_pile));
// Create a task to print out the profiler data
profile_print_interval_ms = print_interval_ms;
xTaskCreatePinnedToCore(
profiler_task,
"profiler_task",
4096,
NULL,
1,
NULL,
core_to_run_profiler_on
);
}
#endif
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment