Created
January 22, 2025 15:39
-
-
Save connornishijima/414e48e73cc9b01d3c7cd40cc2d1f7fc to your computer and use it in GitHub Desktop.
SPF - A simple and powerful profiler for Espressif microcontrollers by Connor Nishijima
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| // ------------------------------------------------------------------- | |
| #define PROFILER_ENABLED 1 // ENABLED = MINOR OVERHEAD! | |
| /* _______ _______ _______ | |
| | | | | | | | |
| | _____| | _ | | ___| | |
| | |_____ | |_| | | |___ | |
| |_____ | | ___| | ___| | |
| _____| | | | | | | |
| |_______| |___| |___| | |
| ############################################################## | |
| CONNOR NISHIJIMA'S WILDLY EASY *AND* POWERFUL ESP32 PROFILER | |
| Released under the GPLv3 license. Enjoy! | |
| ############################################################## | |
| How much CPU % does each section of your code cost? | |
| This profiler can tell you in 3 simple steps. | |
| 1. Include this file in your project | |
| 2. Write "spf()" at the top of functions you want to measure, | |
| and "epf()" at the bottom. | |
| 3. Call "init_profiler();" once at boot to start the profiler task | |
| That's it! The profiler will automatically print a report every | |
| 3 seconds over UART with the CPU USAGE % of all currently | |
| measured functions. | |
| SPF reports print exactly like this: | |
| --- PROFILER ------------------------------------------ | |
| CPU % | FUNCTION NAME | FILE:LINE | |
| --------+----------------------+----------------------- | |
| 00.749% | run_particles | src/particles.h:32 | |
| 00.021% | count_fps | src/main.cpp:76 | |
| 21.699% | draw_lines | src/main.cpp:650 | |
| 77.448% | RENDERING BEAM | src/main.cpp:687 | |
| 00.084% | loop | src/main.cpp:810 | |
| The names and locations of your functions are automatically detected by | |
| the compiler, and you can even measure arbitrary subsections of code | |
| with spf_named( name ) like this: | |
| spf_named( "RENDERING BEAM" ); | |
| // Do work here | |
| epf(); | |
| This is a simple but efficient profiler for any Espressif microcontroller | |
| This file should be compatible with Arduino and ESP-IDF 5.x out of box, | |
| since it uses only IDF and C standard libraries. | |
| It returns the relative percentage of CPU cycles taken by all measured | |
| functions, and is designed to have microscopic overhead with clever memory | |
| use. When disabled, the compiler will replace all function calls to profiler | |
| functions with empty variants that consume no program space or CPU time. | |
| --- FUNCTIONS ---------------------------------------------------------------------- | |
| - init_profiler(): | |
| Call once at boot to prepare memory and start the profiler task | |
| - spf(): | |
| "Start Profiling" - Write this at the top of all functions that | |
| you want to measure, including your main loop. This automatically | |
| stores the current function name, name of the source code file, | |
| the line number, and the parent function which called it. More | |
| on that later. | |
| - epf(): | |
| "End Profiling" - Write this at the bottom of every measured | |
| function to properly close them | |
| - spf_named( name ): | |
| "Start profile on custom named subsection of a function" | |
| Works exactly like spf(), but you can override the name | |
| and use it freely within subsections of code. | |
| ALSO REQUIRES A CLOSING epf() call of it's own afterwards | |
| --- HOW IT WORKS (COMPILER) -------------------------------------------------------- | |
| When you build your code, the C Preprocessor will find all instances of | |
| "spf()" that you wrote and automatically replace them with this: | |
| start_profile(__func__, __FILE__, __LINE__, __COUNTER__); | |
| The __func__, __FILE__, __LINE__, and __COUNTER__ are special C macros. | |
| Importantly, __func__ is always a const char* of the exact name of the function | |
| in the source code where __func__ was written and __COUNTER__ is replaced with | |
| a unique integer for every single occurance in your code. | |
| For example, a "sketch.ino" file like this: | |
| 1 void my_function(){ | |
| 2 spf(); | |
| 3 // Custom code to measure | |
| 4 epf(); | |
| 5 } | |
| 6 | |
| 7 void my_other_function(){ | |
| 8 spf(); | |
| 9 // Custom code to measure | |
| 10 epf(); | |
| 11 } | |
| 12 | |
| 13 void setup(){ | |
| 14 init_profiler(); | |
| 15 my_function(); | |
| 16 my_other_function(); | |
| 17 } | |
| ...gets automatically converted into this before being uploaded to the chip: | |
| 1 void my_function(){ | |
| 2 start_profile("my_function", "sketch.ino", 2, 0); // Line 2, Counter 0 | |
| 3 // Custom code to measure | |
| 4 end_profile(); | |
| 5 } | |
| 6 | |
| 7 void my_other_function(){ | |
| 8 start_profile("my_other_function", "sketch.ino", 8, 1); // Line 8, Counter 1 | |
| 9 // Custom code to measure | |
| 10 end_profile(); | |
| 11 } | |
| 12 | |
| 13 void setup(){ | |
| 14 init_profiler(); | |
| 15 my_function(); | |
| 16 my_other_function(); | |
| 17 } | |
| This way you can just re-use the same "spf()" and "epf()" lines | |
| everywhere without redundantly entering function names while you work. | |
| --- HOW IT WORKS: (RUNTIME) -------------------------------------------------------- | |
| At boot, [core_to_run_profiler_on] is given a task where it samples the | |
| "currently_executing_section" at around 1000Hz. | |
| Every time you call spf(), the previous value of currently_executing_section | |
| is pushed to a stack, and it's overwritten by the value of __COUNTER__ in the | |
| underlying "start_profile()" behind the macro. __COUNTER__ is used in this way | |
| to automatically give each profiled section of code a unique index in the stack. | |
| Every time you call epf(), the currently_executed_section is returned | |
| to the previous value in the stack, or in other words: returning from this | |
| function back to the parent who called it. | |
| During runtime, every [profile_print_interval_ms] milliseconds, the profiler | |
| will automatically print out a report of the execution time of all measured functions: | |
| --- PROFILER ------------------------------------------ | |
| CPU % | FUNCTION NAME | FILE:LINE | |
| --------+----------------------+----------------------- | |
| 00.749% | run_particles | src/particles.h:32 | |
| 00.021% | count_fps | src/main.cpp:76 | |
| 21.699% | draw_lines | src/main.cpp:650 | |
| 77.448% | RENDERING BEAM | src/main.cpp:687 | |
| 00.084% | loop | src/main.cpp:810 | |
| In this case, a function called "draw_lines()" is taking up 22.85% of all | |
| CPU cycles, and a custom named subsection "RENDERING BEAM" is taking up | |
| 76.34% of all CPU cycles. The remaining time is spent in menial functions | |
| that don't cost many cycles themselves, such as the loop() which immediately | |
| calls "draw_particle_lines()" and jumps away from itself upon starting. | |
| */ | |
| // For uint32_t, uint16_t, etc. | |
| #include <stdint.h> | |
| // For strncpy, strlen, etc. | |
| #include <string.h> | |
| // For printf, etc. | |
| #include <stdio.h> | |
| // For delaying and tasks | |
| #include "freertos/FreeRTOS.h" | |
| #include "freertos/task.h" | |
| // Default to running profiler on opposite core of Arduino, if Arduino | |
| #ifdef ARDUINO_RUNNING_CORE | |
| #define DEFAULT_CORE_TO_RUN_PROFILER_ON (!ARDUINO_RUNNING_CORE) | |
| #else | |
| #define DEFAULT_CORE_TO_RUN_PROFILER_ON 0 // Default to core 0 | |
| #endif | |
| // Maximum number of "spf()" occurrences in your code at a time, costs 40 bytes per function | |
| #define MAX_PROFILED_FUNCTIONS 32 // 32 = 1.28K | |
| // 64 = 2.56K | |
| // 128 = 5.12K of RAM | |
| // Print a report every 3 seconds | |
| #define DEFAULT_PROFILE_PRINT_INTERVAL_MS 3000 | |
| // INSTRUCTIONS FOR THE COMPILER: | |
| // FIND this ----> REPLACE with this | |
| #define spf() start_profile( __func__, __FILE__, __LINE__, __COUNTER__ ) | |
| #define spf_named( s ) start_profile( s, __FILE__, __LINE__, __COUNTER__ ) | |
| #define epf() end_profile() | |
| #define MIN(a,b) (((a)<(b))?(a):(b)) | |
| // If the profiler is disabled, all related functions are left empty instead | |
| // A modern compiler likely realizes these are empty and simply removes | |
| // them from existence so that no CPU cycles are spent in assembly jumping | |
| // in and out of them. | |
| #if PROFILER_ENABLED == 0 | |
| inline void start_profile(const char *func, const char *file, uint32_t line, uint32_t counter){} | |
| inline void end_profile(){} | |
| void init_profiler(){} | |
| // Everything within this only exists when the profiler is enabled | |
| #elif PROFILER_ENABLED == 1 | |
| // Updated dynamically with __COUNTER__ every time | |
| // that spf() and epf() occur in the code: | |
| volatile int32_t currently_executing_section = 0; | |
| // Can be updated later in init_profiler() | |
| uint32_t profile_print_interval_ms = DEFAULT_PROFILE_PRINT_INTERVAL_MS; | |
| // Functions are tracked with these structures | |
| typedef struct { | |
| char func[16]; // Function name | |
| char file[16]; // File it's in | |
| uint16_t line; // Line it's on | |
| uint16_t parent; // Function which called it | |
| uint32_t hits; // Number of times it's been spotted running | |
| } profile_t; | |
| // The record of all measured functions | |
| static profile_t profile_pile[MAX_PROFILED_FUNCTIONS]; | |
| uint32_t get_ms(){ | |
| return xTaskGetTickCount() * portTICK_PERIOD_MS; | |
| } | |
| // The task that prints out the profiler data from the opposite core | |
| void profiler_task(void *pvParameters){ | |
| while(1){ | |
| // Wait one millisecond | |
| vTaskDelay(pdMS_TO_TICKS(1)); | |
| // If we're not before the first call to spf(), | |
| // mark the current section as having been seen | |
| if(currently_executing_section != 0){ | |
| profile_pile[currently_executing_section].hits++; | |
| } | |
| // Check if enouch time has passed to print the next profiler report | |
| static uint32_t last_profile_print_ms = get_ms(); | |
| uint32_t current_time_ms = get_ms(); | |
| bool print_report = (current_time_ms - last_profile_print_ms >= profile_print_interval_ms); | |
| // If it's time to print the report, do so | |
| if(print_report == true){ | |
| last_profile_print_ms = current_time_ms; | |
| // Calculate the total number of hits | |
| uint32_t total_hits = 0; | |
| for(uint32_t i = 0; i < MAX_PROFILED_FUNCTIONS; i++){ | |
| total_hits += profile_pile[i].hits; | |
| } | |
| // Print all functions as a percentage of total hits | |
| printf("--- SPF PROFILER --------------------------------------\n"); | |
| printf("CPU % | FUNCTION NAME | FILE:LINE\n"); | |
| printf("--------+----------------------+-----------------------\n"); | |
| for(uint32_t i = 0; i < MAX_PROFILED_FUNCTIONS; i++){ | |
| if(profile_pile[i].hits > 0){ | |
| float cpu_usage = (profile_pile[i].hits / (float)total_hits) * 100.0; | |
| printf( | |
| "%06.3f%% | %-20s | %s:%d\n", | |
| cpu_usage, | |
| profile_pile[i].func, | |
| profile_pile[i].file, | |
| profile_pile[i].line | |
| ); | |
| } | |
| } | |
| printf("-------------------------------------------------------\n"); | |
| } | |
| } | |
| } | |
| // Start profiling a function | |
| // This is filled out automatically and called by the spf() macro | |
| inline void start_profile(const char *func, const char *file, uint32_t line, uint32_t counter){ | |
| if(counter < MAX_PROFILED_FUNCTIONS){ | |
| if(line != profile_pile[counter].line){ | |
| strncpy(profile_pile[counter].func, func, 15); | |
| profile_pile[counter].func[15] = '\0'; // Ensure null termination | |
| strncpy(profile_pile[counter].file, file, 15); | |
| profile_pile[counter].file[15] = '\0'; // Ensure null termination | |
| profile_pile[counter].line = line; | |
| } | |
| profile_pile[counter].parent = currently_executing_section; | |
| currently_executing_section = counter; | |
| } | |
| } | |
| // End profiling a function | |
| // This is called by the epf() macro | |
| inline void end_profile(){ | |
| currently_executing_section = profile_pile[currently_executing_section].parent; | |
| } | |
| // Run this once at boot to prepare memory and start the profiler task | |
| void init_profiler(uint32_t print_interval_ms = DEFAULT_PROFILE_PRINT_INTERVAL_MS, uint32_t core_to_run_profiler_on = DEFAULT_CORE_TO_RUN_PROFILER_ON){ | |
| memset(profile_pile, 0, sizeof(profile_pile)); | |
| // Create a task to print out the profiler data | |
| profile_print_interval_ms = print_interval_ms; | |
| xTaskCreatePinnedToCore( | |
| profiler_task, | |
| "profiler_task", | |
| 4096, | |
| NULL, | |
| 1, | |
| NULL, | |
| core_to_run_profiler_on | |
| ); | |
| } | |
| #endif |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment